SIGNAL ANALYSIS TECHNIQUES FOR INTERPRETING ELECTROENCEPHALOGRAMS


INTRODUCTION 

Objective 

The  principal  objective  of  this  project  was  to  review  and  assemble  the 
relevant signal-processing literature for analyzing and interpreting
electroencephalograms (EEG), especially visual evoked responses (VER). In the course
of  satisfying  this  objective,  it  was  necessary  to  consider  the  goals  of  the 
U.S.  Air  Force;  that  is,  to  consider  the  fundamental  U.S.  Air  Force  objectives 
which  led  to  this  study. 

The  principal  U.S.  Air  Force  objective  is  to  investigate  the  usefulness 
of  the  EEG/VER  as  a  tool  in  assessing  the  effects  of  visual  stimuli  upon  the 
flier, especially stimuli which could lead to a temporary (or permanent)
degradation of performance. This could include flashblindness, disorientation,
or  other  physiological  impairment  caused  by  visual  stimuli. 


Scope 

This  study  has  been,  by  necessity,  somewhat  limited  in  scope.  We  were 
concerned  with  assessing  the  state-of-the-art  in  digital  signal  processing  as 
it relates to analysis and interpretation of the EEG, especially the VER. The
nature of the EEG signal is very complex, and for this reason the required tools
may  be  quite  sophisticated. 

The study has concentrated on those methods suitable for analysis of visual
evoked responses rather than those more suited for analysis of the spontaneous
EEG. Thus, emphasis is not placed on tracking periodicities such as alpha,
beta, or theta waves, since these tend to be missing in the VER. However,
methods of modeling the spontaneous EEG have been considered, since eliminating
spontaneous EEG components should make the VER more evident.

In the course of this study, we have investigated the problem of measurement
variability observed after signal processing. In particular, we have
analyzed the current processing techniques used by the U.S. Air Force to
determine whether the observed variability might be caused or aggravated by the
processing. We have also conducted a thorough literature search to look for
neurophysiological evidence of variability and possible cures. Finally, we
have reviewed advanced signal-processing techniques to determine their
potential for reducing variability.

The results of our efforts are encouraging. We have found reason to believe
that some improvement is possible, although the magnitude of the improvement
and the recommended processing techniques cannot be determined without
a thorough analysis of the original (preprocessed) data. We suspect, however,
that significant improvements may not be possible without more sophisticated
processing and modified experimental practice and data collection.


General  VER  Signal  Characteristics  and  Processing  Implications 

The  characteristics  of  the  VER  signals  we  wish  to  analyze  are  not 
easily  described  using  simple  models.  An  example  of  an  idealized  VER  is 
shown  in  Figure  1.  The  signal  is  clearly  nonstationary  and  of  high  order. 
Furthermore,  the  VER  cannot  be  modeled  by  a  minimum-phase  system  since 
energy  in  the  VER  grows  with  time  during  the  initial  response.  This 
complicates  the  modeling  process. 


Figure 1. Schematic of a visual evoked potential
(after Desmedt (23, p. 18)).


The  VER  is  a  stochastic  process  and  any  realistic  analysis  must  include 
a  consideration  of  stochastic  effects.  An  example  of  a  set  of  averaged  VERs 
taken  from  the  same  human  subject,  but  at  different  times,  is  shown  in 
Figure  2.  The  variability  between  records  is  not  systematic  and  follows  no 
simple  pattern.  The  problem  is  made  more  difficult  by  the  fact  that  the  VER 
is  a  collective  process  arising  due  to  the  action  of  many,  variably  coupled, 
cellular  generators.  This  lack  of  determinism  (predictability)  in  the  VER 
suggests  that  stochastic  effects  are  significant  and  that  care  must  be  taken 
when  modeling  them. 

In designing VER signal-processing techniques, it is essential that the
following  points  be  kept  in  mind: 

a)  The  VER  signal  is  a  stochastic  process  with  a  significant  amount  of 
unpredictability  in  time. 

b)  The  VER  signal  is,  at  least  in  part,  a  nonstationary  process.  It 
is  important  that  analysis  techniques  explicitly  take  this  fact 
into  account. 

c)  The  information  we  seek  may  be  spread  over  more  than  one  electrode. 
Thus  multivariate  analysis  techniques  should  be  employed. 

d) The relationship of input stimulus to output response we are
analyzing will be nonlinear, especially the saturation
(flashblindness) reaction.

e) The corrupting noises we wish to filter out may be partially
correlated over several adjacent electrodes. This is another
argument for using multivariate signal analysis techniques.


Figure 2. VER variability. Each tracing is an average of 240
successive 1-sec segments time-locked to a checkerboard
pattern stimulus at a reversal rate of 4 per second.
Data are for the human subject under the same
controlled conditions during the same day, covering
both morning and afternoon. Data supplied by the
U.S. Air Force.

With  these  points  in  mind,  we  can  now  broadly  outline  an  overall 
approach  to  signal  processing: 

a) We may think of the EEG/VER signals as outputs of a system which we
can model phenomenologically. That is, using all information at
hand, we will use a system model to generate the evoked potentials.

b) We wish to take advantage of the known physiological characteristics
of the eye and brain, insofar as they suggest particular
model structures.

c)  We  will  employ  generic  models  which  can  be  used  to  explain  as  many 
observed  phenomena  as  possible. 

d)  We  will  take  advantage  of  recent  developments  in  the  fields  of 
system  modeling  and  identification,  time  series  analysis,  adaptive 
Kalman  filtering,  spectral  analysis,  and  pattern  recognition. 

e) We will suggest experiment design methods in order to enhance
model identification by tailoring stimulus features to particular
measurable reactions.


Overview 

This report begins with a discussion of the signal-processing approach
presently used by the U.S. Air Force for VER analysis, namely the Fast
Fourier Transform (FFT). The observed variability in the periodogram is
explained qualitatively and quantitatively via several simple signal and
noise models. Based on these results, different signal-processing
techniques based on FFT analysis are suggested.

The  physiological  aspects  of  VER  variability  are  discussed  in  the 
section  "Aspects  of  EEG/VER  Variability."  A  comprehensive  literature  search 
has  been  carried  out  in  order  to  relate  the  U.S.  Air  Force  problem  to  the 
work  of  other  research  groups  and  their  findings.  Several  mechanisms  for 
explaining  the  observed  variabilities  are  presented.  The  important  problem 
of  how  to  determine  an  appropriate  measure  of  VER  activity  is  discussed. 

The  section  "Improved  Techniques  for  VER  Analysis"  is  devoted  to  a 
discussion of alternate signal-processing techniques which are appropriate
for  VER  signal  processing.  These  include  recently  developed  analytical 
methods  and  modeling  approaches  as  well  as  more  classical  approaches.  They 
have  been  culled  from  a  comprehensive  review  of  the  EEG  literature,  as  well 
as  specific  Scientific  Systems,  Inc.,  experience  in  biological  signal 
processing. 

Our conclusions and recommendations are given. Appendixes A through J
are sections from our Interim Report that summarized our survey of
signal-processing techniques applicable to EEG/VER analysis. That report
began with a review of the physiological background of the problem and the
general signal-processing objectives of the U.S. Air Force. We then
discussed a number of signal-processing methods which may be useful in
fulfilling the objectives. We concluded with a discussion of how these
methods might be applied in VER experimentation.


ANALYSIS  OF  CURRENT  PROCESSING 

This  section  discusses  some  of  the  current  signal-processing  problems 
facing  the  U.S.  Air  Force  researchers  and  outlines  potential  solutions  and 
analysis  techniques  for  reducing  signal  variability.  This  variability  in 
the  processed  data,  described  in  more  detail  below,  is  the  major  obstacle  to 
developing  accurate  visual  performance  measures.  We  believe  this  obstacle 
can  be  reduced,  if  not  removed. 

The  basic  questions  we  tried  to  answer  are:  is  the  large  amount  of 
variability  due  to  a  signal  characteristic  or  processing  technique,  and  can 
it  be  reduced  by  alternate  processing  methods?  Our  answer,  based  on  the 
available  data  and  explained  in  this  section,  is:  the  variability  is 
largely  consistent  with  a  noisy  measurement  model;  that  is,  the  variability 
is  probably  due  to  "noise"  in  the  measured  signal  and  is  not  a  processing 
artifact,  although  different  processing  methods  have  varying  sensitivities 
to  the  noise.  Thus,  the  effect  of  the  (signal)  variability  on  a  visual 
performance  measure  may  be  reduced  by  alternate  processing  techniques,  some 
of  which  are  quite  simple  and  fast.  In  order  to  recommend  a  specific 
processing  technique,  however,  an  analysis  of  the  raw  data  (measured  EEG 
signals,  with  and  without  stimulus)  is  necessary. 


Experiment  Purpose 

In  order  to  evaluate  potential  processing  techniques,  an  understanding 
of  the  purpose  of  the  experiments  is  necessary.  The  immediate  objective  of 
the  current  experiments  is  to  develop  a  measure  of  visual  system  (eye  and 
brain)  performance,  using  EEG  data,  that  is  accurate  enough  to  distinguish 
between  levels  of  visual  acuity  ranging  from  normal  sight  to  flash-induced 
blindness.  This  measure  must  be  computable  in  a  reasonable  amount  of  time 
(e.g.,  1  min  or  less)  in  order  to  permit  accurate  tracking  of  the  recovery 
for  temporary  flashblindness. 

The  achievement  of  this  objective  is  hampered  by  certain  experimental 
constraints  (imposed  in  order  to  make  the  results  relevant  to  the  U.S.  Air 
Force  mission),  such  as  narrow  fields-of-view  and  anesthetized  subjects, 
which  reduce  the  amplitude  of  the  VER  and  make  it  difficult  to  measure.  To 
date,  the  observed  variability  in  the  processed  data  is  sufficiently  large 
to  make  accurate  visual  performance  evaluation  extremely  difficult.  We 
would  like  to  examine  whether  this  observed  variability  is  due  to  the  poor 
signal  strength  or  other  factors. 

We  begin  by  describing  the  observed  variability  and  then  investigating 
the processing used on the data. Next we discuss an alternate processing
technique and compare its performance to that of the current method. We end
this section with a summary of our conclusions to date.


Observed  Variability 

An  example  of  the  degree  of  variability  is  shown  in  Figure  3,  which  was 
made  using  actual  data  supplied  by  the  Air  Force.  The  figure  shows  the 
power  (in  undesignated  units)  of  an  evoked  potential  (left-hand  data)  and 
background  noise  (right-hand  data)  when  processed  in  a  particular  manner, 
described  in  the  next  section.  The  experiment  was  performed  on  an 
anesthetized  monkey,  and  the  points  have  been  reordered  for  this  plot.  Each 
point  represents  60  sec  worth  of  data,  and  the  noise  points  were  originally 
interspersed between signal points. Data were taken for approximately 5 hr,
and only the first group (up to a rest period at 40 min) is shown. The
rest of the data was qualitatively similar to that shown.

The  experiment  conducted  was  of  the  "steady-state"  type,  discussed  in 
the  section  "Aspects  of  EEG/VER  Variability,"  where  the  stimulus  was  a 
sinusoidal  grate  pattern  which  reversed  4  times  per  second.  The  processing 
employed  tries  to  estimate  the  evoked  response  power  (at  4  Hz)  while 
suppressing  the  background  EEG.  The  same  processing  is  applied  to  each  data 
group--the  only  change  is  the  presence  or  absence  of  the  stimulus  pattern. 


Current  Signal  Processing 

In  order  to  examine  what  the  signal  (EEG)  characteristics  are,  we  need 
to  understand  what  the  current  processing  technique  does  to  the  data.  By 
inverting  the  processing  operations,  we  would  like  to  arrive  at  a  signal 
model  which  can  be  used  to  obtain  a  better  processor.  Such  an  inversion  is 
nearly  impossible  from  the  limited  processed  data  available,  of  course,  and 
a  complete  signal  model  must  await  the  analysis  of  the  original  raw  data. 
Nonetheless,  much  can  be  learned  from  simple  potential  models,  as  discussed 
below. 

The  basic  processing  approach  currently  used  is  a  hybrid  time-average 
and  Fourier  Transform  which  has  several  interesting  properties.  The  raw 
signal (measured EEG), denoted by $x(t)$, is first averaged at 1-sec intervals,
forming $\bar{x}(t)$:

$$\bar{x}(t) = \frac{1}{N} \sum_{n=0}^{N-1} x(t+n), \qquad t \in (0,1)$$


Thus,  for  the  data  of  Figure  3,  60  sec  worth  of  raw  data  is  divided 
into  60,  1-sec  intervals  (each  with  4  evoked  responses)  carefully  aligned 
with  the  pattern  reversal  signal.  These  60  records  are  then  added  together 
and  divided  by  60  to  obtain  an  average  evoked  response  1  sec  long, 
containing  4  individual  average  evoked  responses.  This  average  signal  is 
then  passed  through  an  FFT  algorithm  (see  Appendix  C),  which  produces  a 
complex  sequence  equally  spaced  in  frequency  (with  1  Hz  resolution,  in  this 
case). The periodogram (power estimate formed with the magnitude of the FFT
coefficients) is then computed, and the power component at 4 Hz is obtained.
This power estimate is the data point used to summarize the 60-sec record,
and these points are the ones shown in Figure 3. The processing sequence is
shown in Figure 4.


Figure 3. [...] (left) and background noise (right) power levels.


[Figure 4 labels: Time Average; Time Series $\bar{x}(t)$; $X(\omega)$; Periodogram]

Figure 4. Processing sequence of time series.
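The processing sequence described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function name `current_estimate`, the 64-Hz sampling rate, and the 1/m scaling of the FFT (chosen so that a sinusoid of power P gives PT/2 for T = 1 sec) are not taken from the report.

```python
import numpy as np

def current_estimate(x, fs, stim_hz=4.0, epoch_sec=1.0):
    """Time-average the 1-sec epochs, then read the periodogram at stim_hz.

    Hypothetical helper: the scaling and argument names are assumptions.
    """
    m = int(round(fs * epoch_sec))            # samples per epoch
    n_epochs = len(x) // m                    # 60 for a 60-sec record
    epochs = np.reshape(x[: n_epochs * m], (n_epochs, m))
    x_bar = epochs.mean(axis=0)               # averaged 1-sec response
    spectrum = np.fft.rfft(x_bar) / m         # assumed 1/m scaling
    freqs = np.fft.rfftfreq(m, d=1.0 / fs)    # 1-Hz spacing for 1-sec epochs
    k = int(np.argmin(np.abs(freqs - stim_hz)))
    return np.abs(spectrum[k]) ** 2           # periodogram value at 4 Hz

# Noise-free check: a sinusoid of power P = 9 locked to the epochs.
fs = 64                                       # assumed sampling rate, Hz
t = np.arange(60 * fs) / fs
P = 9.0
x = np.sqrt(2 * P) * np.sin(2 * np.pi * 4 * t)
power_4hz = current_estimate(x, fs)           # equals P/2 = 4.5
```

With this scaling the noise-free result is PT/2, matching the sinusoid-only case discussed in the text.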


The  two-step  processing  (averaging,  then  taking  the  FFT)  should  reduce 
"noise"  (spontaneous  EEG  not  in  phase  with  the  stimulus  for  60  sec)  and 
enhance  the  typical  evoked  response.  The  averaging  process  will  favor 
patterns  that  repeat  at  multiples  of  1  Hz  for  the  full  60  sec  and  suppress 
those  signals,  such  as  the  background  EEG,  which  naturally  "wander"  in  phase. 
In  particular,  white  Gaussian  noise  will  have  its  variance  reduced  by  a 
factor  of  60;  that  is,  if 

$$x(t) = n(t)$$

$$E[n(t)] = 0$$

$$E[n(t)\,n(\tau)] = Q\,\delta(t-\tau)$$

where $\delta$ is the Dirac delta function and Q is the height of the noise
spectrum, then

$$E[\bar{x}(t)] = 0$$

$$E[\bar{x}(t)\,\bar{x}(\tau)] = \frac{Q}{N}\,\delta(t-\tau) \tag{1}$$

Pure sinusoids at multiples of 1 Hz, however, will pass through the averaging
unaffected (i.e., with full power). Specifically, if

$$x(t) = \sqrt{2P}\,\sin \omega_0 t, \qquad t \in (0,T)$$

then

$$s(\omega_0) = \frac{PT}{2}$$

for  the  scale  factors  used  in  our  analysis.  The  application  of  the  power 
estimator  when  more  complex  signals  are  present  is  not  as  straightforward, 
however. 
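The variance reduction stated in Eq. 1 can be checked numerically. The sketch below is an illustration only: the sample counts, the unit-variance noise, and the random seed are assumptions chosen for convenience.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m, trials = 60, 64, 500            # 60 epochs, 64 samples each (assumed)

# Independent white Gaussian noise epochs with unit variance.
noise = rng.normal(0.0, 1.0, size=(trials, N, m))

# Averaging the N epochs of each trial gives one averaged epoch per trial.
averaged = noise.mean(axis=1)

# Ratio of raw variance to averaged variance; close to N = 60.
reduction = noise.var() / averaged.var()
```

The ratio fluctuates slightly from run to run, but stays near N, as Eq. 1 predicts for noise that is uncorrelated between epochs.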
FFT Operation


The FFT function often used to obtain power estimates is called the
periodogram, as discussed in Appendix C. When pure sinusoids are processed
in  a  periodogram,  good  power  estimates  are  obtained,  as  discussed  above. 

When  noisy  signals  are  processed  by  a  single  periodogram,  however,  very  poor 
spectral  estimates  are  obtained,  with  the  error  in  the  estimate  often  as 
large  as  the  true  power  level.  In  particular,  for  a  Gaussian  spectrum,  the 
standard  deviation  of  the  estimate  is  equal  to  the  true  spectrum  height. 
Moreover,  when  both  a  pure  sinusoid  and  a  Gaussian  process  are  observed,  the 
estimation  error  (at  the  sinusoid's  frequency)  can  be  much  larger  than  the 
Gaussian  process  alone  might  suggest. 

Consider a signal of the form

$$z(t) = \sqrt{2P}\,\sin \omega_0 t + n(t)$$

where

$$E[n(t)\,n(\tau)] = Q\,\delta(t-\tau)$$

If n(t) is observed alone and an FFT (periodogram) power estimate obtained
from it, the average power at each frequency would be

$$\bar{s}(\omega) = Q$$

The variance of each power estimate would be $Q^2$. This results from the fact
that the elements of the periodogram are Chi-squared random variables with
2 degrees of freedom. If the pure sinusoid were observed alone for T
seconds, the power estimate at $\omega_0$ would be

$$s(\omega_0) = \frac{PT}{2}$$

with zero error, as discussed above.


The question arises: if both components of z(t) are observed, what are
the mean and variance of the spectral estimates?


To answer this question, we first note that the FFT is a linear
operation (before the periodogram is computed) and that therefore the
Fourier coefficients are normal random variables. Thus, for the signal
above, we may model the Fourier components as*

$$x_i(\omega_j) \sim \begin{cases} N(0,\; Q/2), & \omega_j \neq \omega_0 \\ N(\tfrac{1}{2}\sqrt{PT},\; Q/2), & \omega_j = \omega_0 \end{cases} \qquad i = 1, 2$$

and the spectral estimate is

$$s(\omega_j) = x_1^2(\omega_j) + x_2^2(\omega_j)$$

*$x \sim N(m, \sigma^2)$ implies that x is a normal (Gaussian) random variable
with mean m and variance $\sigma^2$. $\overline{(\ )} = E(\ )$ is used to denote
the mean of the quantity ( ). Var( ) is used to denote variance. The mean of
the $x_i(\omega_0)$ components depends on the phase angle of the sinusoid, and
we have assumed a 45° angle here. The spectral estimate s is, by design,
independent of the phase.

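Thus, if Q = 0

$$s(\omega_0) = \frac{PT}{2}$$

and if P = 0

$$\bar{s}(\omega_j) = Q$$

$$\mathrm{Var}\; s(\omega_j) = Q^2$$

which agrees with our previous result.
When Q and P are both not zero,

$$\bar{s}(\omega_0) = \frac{PT}{2} + Q \tag{2}$$

$$\mathrm{Var}\; s(\omega_0) = Q^2 + PTQ \tag{3}$$

as derived in Appendix K.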

Thus,  the  mean  spectral  estimate  is  precisely  the  sum  of  the  sinusoidal 
and  white  noise  means.  The  variance,  however,  is  larger  than  the  white 
noise  variance,  and  includes  a  cross-product  (PTQ)  that  results  from  the 
squaring  operation  used  for  the  power  estimate.  In  the  case  of  the  data 
shown  in  Figure  3,  the  cross  product  term  may  be  much  larger  than  the  white 
noise term ($Q^2$).
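Equations 2 and 3 can be checked by a small Monte Carlo experiment on the Fourier-component model itself: two normal components with variance Q/2 and mean (1/2)sqrt(PT), squared and summed. The sketch below is illustrative; the function name, trial count, and seed are assumptions, and the values P = 9, Q = 0.5, T = 1 are picked from the range the report uses later.

```python
import numpy as np

def single_periodogram_stats(P, Q, T=1.0, trials=200_000, seed=0):
    """Sample mean and variance of s = x1^2 + x2^2 at the sinusoid frequency."""
    rng = np.random.default_rng(seed)
    mu = 0.5 * np.sqrt(P * T)                 # component mean (45 deg phase)
    x = rng.normal(mu, np.sqrt(Q / 2.0), size=(trials, 2))
    s = np.sum(x ** 2, axis=1)                # single-periodogram estimate
    return s.mean(), s.var()

P, Q, T = 9.0, 0.5, 1.0
mc_mean, mc_var = single_periodogram_stats(P, Q, T)
eq2_mean = P * T / 2 + Q                      # Eq. 2: 5.0
eq3_var = Q ** 2 + P * T * Q                  # Eq. 3: 4.75
```

For these values the cross-product term PTQ (4.5) dominates the white noise term Q^2 (0.25), which is the effect described in the text.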

If  the  noise  process  is  not  white  but  colored  (correlated),  the  results 
are  similar.  Indeed,  if  the  noise  spectrum  is  essentially  flat  over  a 
bandwidth  enclosing  the  sinusoid  and  as  wide  as  the  FFT  resolution,  the 
white  noise  model  may  be  used.  In  the  present  case,  if  the  FFT  has 
frequency  samples  1  Hz  apart  and  the  background  spectrum  is  nearly  flat  from 
3  to  5  Hz,  then  a  white  noise  model  may  be  used  for  examining  the  4-Hz  power 
component.  Thus,  the  two  noise  spectra  shown  in  Figure  5  have  essentially 
the same effect on estimating the 4-Hz power component in an FFT. This
observation  may  be  used  to  justify  a  white  noise  model  for  much  of  our 
current  analysis,  where  very  little  is  known  about  the  signal  away  from  4  Hz. 


Figure 5. White and colored noise indistinguishable near 4 Hz.


Current  U.S.  Air  Force  Processing 

With  the  above  results,  the  performance  of  the  current  U.S.  Air  Force 
processing  technique  may  be  analyzed  for  simple  signal  models.  The  current 
technique  can  be  divided  into  two  steps:  time  series  averaging  and 
periodogram  power  estimation.  As  discussed  above,  the  averaging  step 
reduces the noise (broadband background EEG, measurement noise, etc.)
variance  by  a  factor  of  60.  The  periodogram  then  produces  a  somewhat  noisy 
power  estimate  from  the  averaged  signal. 

Specifically,  we  consider  a  signal  composed  of  a  pure  sinusoid  and 
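white noise, i.e.,

$$z(t) = \sqrt{2P}\,\sin \omega_0 t + n(t)$$

as above. The current technique computes the time average

$$\bar{z}(t) = \frac{1}{N} \sum_{n=1}^{N} z(t+n), \qquad t \in (0,1)$$

The result of this averaging is to create a signal of the form

$$\bar{z}(t) = \sqrt{2P}\,\sin \omega_0 t + \bar{n}(t)$$

where

$$E[\bar{n}(t)\,\bar{n}(\tau)] = \frac{Q}{N}\,\delta(t-\tau)$$

using the results of Eq. 1. Thus, the performance of the current estimator
is:

$$\text{Mean:}\qquad \bar{s}(\omega_0) = \frac{PT}{2} + \frac{1}{N}\,Q \tag{4}$$

$$\text{Variance:}\qquad \mathrm{Var}(s(\omega_0)) = \frac{1}{N^2}\,Q^2 + \frac{1}{N}\,QPT \tag{5}$$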

These  results  are  shown  in  Figures  6  through  8. 
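Equations 4 and 5 are simple enough to evaluate directly. The sketch below does so for P = 9, Q = 30, N = 60, T = 1 sec, the values the report uses in its later simulation; the function name is hypothetical.

```python
import math

def current_estimator_prediction(P, Q, N, T=1.0):
    """Predicted mean and standard deviation from Eqs. 4 and 5."""
    mean = P * T / 2 + Q / N
    variance = (Q / N) ** 2 + Q * P * T / N
    return mean, math.sqrt(variance)

# Stimulus on (sinusoid plus noise) and stimulus off (noise only).
on_mean, on_sigma = current_estimator_prediction(9.0, 30.0, 60)
off_mean, off_sigma = current_estimator_prediction(0.0, 30.0, 60)
```

Here the stimulus-on prediction is a mean of 5.0 with a standard deviation of about 2.18, and the stimulus-off prediction is 0.5 for both.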
[Figures 6 through 8: sketches with axis labels "Power" and "4 Hz"]

Figure 6. Original spectrum.

Figure 7. Spectrum of average time series.

Figure 8. Variability of single periodogram ($\pm\sigma$).


Modeling  Implications  of  Data 

The  sample  statistics  (mean  and  standard  deviation)  of  the  data  shown 
in  Figure  3  (18  points)  are  given  in  Table  1.  Also  shown  in  the  table  are 
the  statistics  for  the  full  (5-hr,  84  and  85  points)  data  sequence  from  that 
experiment. The samples are reasonably close, so that a fairly high
confidence  can  be  placed  in  them. 


TABLE 1. EXPERIMENT STATISTICS

                        Stimulus on           Stimulus off
                      18 pts.   85 pts.     18 pts.   84 pts.
Mean                   5.56      4.41        1.42      1.52
Standard deviation     1.94      1.69        0.74      0.85


We note in particular that the standard deviation-to-mean ($\sigma/m$) ratio
for the background noise (stimulus off) case is approximately 1/2. A
Chi-squared statistic, which would result from normally distributed samples
($x_i(\omega)$) being squared and summed (in the periodogram), would have a ratio of*

$$\frac{\sigma}{m} = \sqrt{\frac{2}{n}}$$

where n is the number of terms being summed. For n = 2, corresponding to
the usual spectral estimate (real and imaginary parts of the FFT),

$$\frac{\sigma}{m} = 1$$

Thus, we see that there is less variability (proportionately lower $\sigma$)
in the stimulus-off case than would be present in a white noise (or
broadband noise) spectrum. This discrepancy, although not overly
significant, indicates that a white noise model is relatively conservative (more
variability than actually observed) and that time-varying spectra (e.g.,
from nonstationary signals) may not be needed.
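The comparison between the Chi-squared ratio and the stimulus-off statistics of Table 1 is a short calculation. The sketch below is illustrative only; the function name is hypothetical, and the numbers are the Table 1 entries.

```python
import math

def chi2_sigma_over_mean(n):
    """sigma/mean for a sum of n squared zero-mean, equal-variance normals."""
    return math.sqrt(2.0 / n)

model_ratio = chi2_sigma_over_mean(2)     # white noise model, n = 2

# Observed stimulus-off ratios from Table 1 (standard deviation / mean).
observed_18 = 0.74 / 1.42                 # first 18 points
observed_84 = 0.85 / 1.52                 # full 84-point sequence
```

The model ratio is 1, while both observed ratios fall between 0.5 and 0.6, which is the sense in which the white noise model is conservative.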

If  we  believe  that  the  stimulus  creates  a  highly  correlated  (nearly 
deterministic)  response  in  the  EEG,  then  the  observed  signal  might  resemble 
a  sinusoid  (at  the  reversal  rate--4  Hz)  plus  the  background  noise.  Using 
noise  values  near  the  background  noise  levels  (0.5  to  1)  and  a  sinusoid 
power  level  (P)  of  9  (to  produce  spectral  heights  on  the  order  of  5  for  T=1 
sec),  we  see  in  Table  2  that  the  resulting  mean  and  standard  deviation 
(calculated  using  Eqs.  2  and  3)  are  close  to  those  seen  in  Table  1. 
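The theoretical values quoted from Eqs. 2 and 3 can be reproduced in a few lines. This is a sketch for checking the arithmetic; the function name is hypothetical, and P = 9, T = 1 sec, and the three noise levels are those stated in the text.

```python
import math

def single_periodogram_prediction(P, Q, T=1.0):
    """Predicted mean and standard deviation from Eqs. 2 and 3."""
    mean = P * T / 2 + Q
    sigma = math.sqrt(Q ** 2 + P * T * Q)
    return mean, sigma

rows = []
for Q in (0.5, 0.7, 1.0):
    on = single_periodogram_prediction(9.0, Q)    # stimulus on
    off = single_periodogram_prediction(0.0, Q)   # stimulus off
    rows.append((on, off))
    print(f"Q={Q}: on mean={on[0]:.1f} sigma={on[1]:.2f}; "
          f"off mean={off[0]:.1f} sigma={off[1]:.1f}")
```

The stimulus-on standard deviations come out as 2.18, 2.61, and 3.16 for Q = 0.5, 0.7, and 1, with means of 5.0, 5.2, and 5.5.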

Once again, the discrepancy between the simple model predictions and
the observed values is not great, and indeed the model values are more
pessimistic (higher $\sigma$) about estimating s than the observed data indicate.

We also see that adding a pure sinusoid to low measurement noise results in
a much larger spread in the spectral estimate than one might think. This is
a normal consequence of using a single periodogram for spectral estimation.
The next section discusses one of the traditional ways of reducing this
variability.

*In this case, $\sigma^2$ would equal Q/60 where Q was the original noise level.


TABLE 2. THEORETICAL STATISTICS

      Stimulus on               Stimulus off
  $\bar{s}$     $\sigma$      $\bar{s}$     $\sigma$
    5          2.18           .5          .5
    5.2        2.61           .7          .7
    5.5        3.16           1           1


Classic  Spectral  Estimation 

One  of  the  standard  techniques  in  spectral  estimation  is  a  slight  variation 
on  the  current  U.S.  Air  Force  approach  of  computing  the  periodogram  of  a  time 
series  average.  The  classic  technique  computes  a  periodogram  for  each  window  in 
the  total  record  length  and  then  averages  the  periodograms  to  obtain  a  power 
spectrum  estimate.  This  averaging  reduces  the  error  in  the  spectral  estimate 
although  it  does  not  reduce  the  noise  level  in  the  signal.  This  is  a  fundamentally 
different  result  from  that  of  the  current  U.S.  Air  Force  processing:  the  classic 
technique  tries  to  estimate  the  complete  spectrum  (signal  plus  noise),  while  the 
current  approach  tries  to  reduce  the  noise  and  then  estimate  the  sinusoidal  power. 

The  classic  approach  gained  favor  because  of  the  severe  sensitivity  of 
a  single  periodogram  to  noise — even  when  the  noise  power  is  low.  This 
sensitivity  was  discussed  above,  where  it  was  shown  that  for  white  noise, 
the  standard  deviation  of  a  single  periodogram  was  as  large  as  the  mean 
value  of  the  spectrum  being  estimated.  The  classic  approach  averages  N 
periodograms to obtain a $\sqrt{N}$ reduction in standard deviation, while the
average  value  converges  to  the  actual  power  spectrum  (sinusoid  plus  noise). 

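Specifically, we consider the N spectral estimates

$$s_n(\omega_0) = x_{1n}^2(\omega_0) + x_{2n}^2(\omega_0), \qquad n = 1, \ldots, N$$

from the N windows:

$$z(t+n), \qquad t \in (0,1), \quad n = 1, \ldots, N$$

For each window

$$x_{in}(\omega_0) \sim N(m, \sigma^2)$$

where

$$m = \tfrac{1}{2}\sqrt{PT}, \qquad \sigma^2 = \frac{Q}{2}$$

and then

$$\hat{s}(\omega_0) = \frac{1}{N} \sum_{n=1}^{N} s_n(\omega_0)$$

is the average spectral estimate. The mean and variance of $\hat{s}$, as computed
in Appendix K, are:

$$E[\hat{s}(\omega_0)] = \frac{PT}{2} + Q \tag{6}$$

and

$$\mathrm{Var}\; \hat{s}(\omega_0) = \frac{Q^2 + PTQ}{N} \tag{7}$$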


This  mean  and  standard  deviation  are  shown  in  Figure  9. 
[Figure 9: sketch with axis label "Power"; legible annotations include $PT/2 + Q$ and $Q^2 + PTQ$]

Figure 9. Classic estimator performance ($\pm\sigma$).


The  classic  technique  is  a  somewhat  ad  hoc  method  which  has  proved  to 
be very useful in unknown-signal applications. The method can be tuned to a
particular  problem  by  varying  the  window  width,  using  window  weighting 
functions,  overlapping  windows,  or  smoothing  the  frequency  estimates  as 
discussed  in  Appendix  C. 
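In its simplest form (rectangular, non-overlapping 1-sec windows, no frequency smoothing), the classic technique can be sketched as follows. The function name `classic_estimate`, the 64-Hz sampling rate, and the 1/m FFT scaling are assumptions made for illustration, not details from the report.

```python
import numpy as np

def classic_estimate(x, fs, stim_hz=4.0, epoch_sec=1.0):
    """Average the per-window periodograms and read the value at stim_hz.

    Hypothetical helper: rectangular, non-overlapping windows are assumed.
    """
    m = int(round(fs * epoch_sec))                 # samples per window
    n_win = len(x) // m
    windows = np.reshape(x[: n_win * m], (n_win, m))
    spectra = np.fft.rfft(windows, axis=1) / m     # one FFT per window
    freqs = np.fft.rfftfreq(m, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - stim_hz)))
    periodograms = np.abs(spectra[:, k]) ** 2      # square first ...
    return periodograms.mean()                     # ... then average

# Noise-free check with a sinusoid of power P = 9.
fs = 64                                            # assumed sampling rate, Hz
t = np.arange(60 * fs) / fs
x = np.sqrt(2 * 9.0) * np.sin(2 * np.pi * 4 * t)
classic_power = classic_estimate(x, fs)            # equals P/2 = 4.5
```

The only difference from the time-average approach is the order of operations: here the squaring comes before the averaging, so noise power is retained in the estimate rather than averaged out.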


Comparison  of  Approaches 

In order to demonstrate the difference in these signal-processing
methods, we consider a sinusoid plus white noise model as discussed above,
with the parameters adjusted to produce results similar to those of Figure 3.
For clarity, the stimulus on (sinusoid plus noise) and stimulus off (noise
only) simulation results are plotted separately in Figures 10 and 11. The
first 10 points represent 10, 60-sec data records of signal plus noise,
followed by 3 min of quiet, and 10, 60-sec noise-only runs. Figure 10 shows
the performance of the classic approach, and Figure 11 shows the current
method. The same data were fed to both processors, and the simulation
parameters were P = 9, Q = 30, N = 60 (60, 1-sec intervals per point for
both techniques). The actual mean and standard deviation of the simulations,
along with the predictions from Eqs. 4 through 7, are shown in Table 3.


Figure 10. Classic estimator (simulation).

Figure 11. Current estimator (simulation).


TABLE 3. COMPARISON OF SPECTRAL ESTIMATORS

                          Stimulus on           Stimulus off
                         mean    $\sigma$      mean    $\sigma$
Classic    Simulated     34.54     4.30       30.59     4.33
           Predicted     34.5      4.42       30.0      3.87
Current    Simulated      4.60     2.68        0.65     0.48
           Predicted      5.0      2.18        0.5      0.5


These  results  are  quite  revealing.  The  results  using  the  current 
approach  (Figure  11)  demonstrate  that  merely  adding  a  sinusoid  to  broadband 
noise  drastically  increases  the  spread  of  the  data  points  in  addition  to 
raising  the  mean.  The  numbers  are  generally  similar  to  those  of  Figure  3, 
as  desired.  The  classic  approach,  on  the  other  hand,  has  a  much  lower 
percentage  variability,  although  the  separation  between  signal  plus  noise 
and  noise  is  not  (proportionately)  as  great.  Appendix  K  shows  that  the 
percentage variability ($\sigma/m$) for the classic method is always lower than
that  of  the  current  processing  scheme  for  a  simple  signal  plus  noise  model. 
Whether  this  result  is  useful  to  the  U.S.  Air  Force  depends  on  the  true 
signal  characteristics. 
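A comparison of this kind can be re-created with a short Monte Carlo sketch. It is not the simulation used in the report: the 64-Hz sampling rate, the per-sample noise variance Q·fs (chosen so that a 1-sec periodogram of the noise has mean Q), the trial count, and the seed are all assumptions.

```python
import numpy as np

def simulate_estimators(P=9.0, Q=30.0, N=60, fs=64, trials=1000, seed=0):
    """Return stimulus-on estimates from the classic and current methods."""
    rng = np.random.default_rng(seed)
    m = fs                                         # samples per 1-sec window
    k = np.arange(m)
    tone = np.sqrt(2 * P) * np.sin(2 * np.pi * 4 * k / fs)
    probe = np.exp(-2j * np.pi * 4 * k / m) / m    # scaled 4-Hz DFT row
    noise = rng.normal(0.0, np.sqrt(Q * fs), size=(trials, N, m))
    z = tone + noise                               # same data for both methods
    classic = np.mean(np.abs(z @ probe) ** 2, axis=1)   # average periodograms
    current = np.abs(z.mean(axis=1) @ probe) ** 2       # periodogram of average
    return classic, current

classic, current = simulate_estimators()
print("classic: mean %.2f  sigma %.2f" % (classic.mean(), classic.std()))
print("current: mean %.2f  sigma %.2f" % (current.mean(), current.std()))
```

Under these assumptions the sample statistics should land near the predicted stimulus-on entries of Table 3 (about 34.5 and 4.42 for the classic method, 5.0 and 2.18 for the current method), with the classic method showing the lower ratio of standard deviation to mean.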

The  amount  of  noise  in  these  simulations  is  quite  large.  Before 
averaging,  the  sinusoid  (P  =  9)  power  level  is  only  4.5  units  above  the 
background noise level of 30 (SNR = 0.15). If the noise (due to background
EEG, measurement noise, processing errors, or physiological artifacts) is
actually this bad or the signal this small, most processing schemes will be
hard-pressed to drastically improve on these results. In order to determine
whether  substantial  improvement  is  possible,  we  believe  the  raw  data  should 
be  analyzed  in  detail. 

We  note  in  passing  that  the  increased  variability  shown  in  t^ 
stimulus-on  results  appears  due  to  the  "cross  term"  PTQ  (or  PTO/N;  the 
variance  formulas.  This  term  is  a  result  of  the  squaring  operatic  in  the 
power  estimates.  If  the  signal  shape  is  known  (or  easily  approximated)  and 
the  phase  lag  (latency)  is  relatively  constant,  it  may  be  possible  to 
linearly  filter  the  signals,  thus  avoiding  the  interference  cross  term.  For 
exampleT  for  the  sinusoid  plus  noise  signal  above,  if  we  form 

$$A = \frac{\sqrt{2}}{T} \int_0^T z(t) \sin \omega t \, dt$$

The mean of A is

$$\bar{A} = \sqrt{P/2}$$
and 

var  A  =  ^ 

which  is  independent  of  P.  For  the  parameters  used  in  the  simulation, 

$$\bar{A} = \sqrt{4.5} = 2.12$$
var  A  =  /30/TZTT  =  0.5 

The  degree  of  noise  attenuation  with  this  approach  depends  on  the  total 
integration  time  (60  sec  in  this  case)  which  is  the  reciprocal  of  the 
bandwidth  used.  A  longer  averaging  time  corresponds  to  a  narrower  bandwidth 
for  the  noise  to  pass  through.  In  the  U.S.  Air  Force  case,  it  may  be 
possible  to  rearrange  experiments  to  permit  longer  data  lengths  before 
stimulus-off  readings.  (If  an  accurate  estimate  of  P  is  obtained,  the 
stimulus-off  readings  might  reduce  to  periodic  baseline  checks.)  If  1-min 
response is needed (or even faster for the flash recovery tests), a sliding
average  may  be  used. 
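
The behavior of such a linear (correlation) estimate can be checked
numerically. The Python sketch below is an illustration under stated
assumptions, not the processing used in the study: the signal is taken as
sqrt(P) sin(wt) in white Gaussian noise, the estimate is normalized by
sqrt(2)/T so that its mean is sqrt(P/2), and the noise level of 30 is
treated as a two-sided spectral density. The sampling rate and trial count
are arbitrary choices.

```python
import math
import random
import statistics


def correlator(z, freq_hz, fs_hz):
    """Discrete stand-in for A = (sqrt(2)/T) * integral of z(t) sin(wt) dt over [0, T]."""
    n = len(z)
    w = 2.0 * math.pi * freq_hz / fs_hz
    return math.sqrt(2.0) / n * sum(v * math.sin(w * i) for i, v in enumerate(z))


def run_trials(p, q=30.0, freq_hz=4.0, fs_hz=32.0, t_sec=60.0, n_trials=300, seed=0):
    """Simulate z = sqrt(p) sin(wt) + white noise and return one estimate A per trial."""
    rng = random.Random(seed)
    n = int(t_sec * fs_hz)
    w = 2.0 * math.pi * freq_hz / fs_hz
    # Assumption: q is a two-sided noise density, so the per-sample variance is q * fs.
    sigma = math.sqrt(q * fs_hz)
    clean = [math.sqrt(p) * math.sin(w * i) for i in range(n)]
    return [correlator([c + rng.gauss(0.0, sigma) for c in clean], freq_hz, fs_hz)
            for _ in range(n_trials)]


if __name__ == "__main__":
    for p in (9.0, 0.0):
        a = run_trials(p, seed=int(p) + 1)
        print("P = %.0f: mean A = %.2f, var A = %.2f"
              % (p, statistics.fmean(a), statistics.pvariance(a)))
```

Because the estimate is linear in z(t), the sinusoid shifts only the mean
of A. The scatter should be essentially the same with and without the
signal, in contrast to the squared (periodogram) estimates.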

Finally, we note that the average of several logs of periodograms is
sometimes used to estimate the log of the spectrum. This estimator has
somewhat  different  properties  (the  variance  becomes  independent  of  the 
spectrum)  which  make  it  attractive  in  some  circumstances.  The  scatter  of 
the  estimates  is  proportionately  reduced,  although  the  sensitivity  to 
amplitude  changes  is  also  lowered. 
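
The claim that the variance of the log estimate does not depend on the
spectrum can be illustrated for the simplest case. The Python sketch below
assumes white Gaussian noise only and a bin away from zero frequency and
the Nyquist frequency. The segment count, segment length, and function
names are choices made for the example.

```python
import cmath
import math
import random
import statistics


def mean_log_periodogram(level, n_segments, n_samples, k, rng):
    """Average of log periodogram values at bin k over n_segments noise-only segments."""
    sigma = math.sqrt(level * n_samples / 2.0)  # makes the periodogram mean equal to level
    phasor = [cmath.exp(-2j * math.pi * k * i / n_samples) for i in range(n_samples)]
    logs = []
    for _ in range(n_segments):
        coeff = sum(rng.gauss(0.0, sigma) * ph for ph in phasor)
        logs.append(math.log(2.0 * abs(coeff) ** 2 / n_samples ** 2))
    return statistics.fmean(logs)


def scatter(level, n_trials=500, n_segments=16, n_samples=32, k=4, seed=0):
    """Mean and standard deviation of the averaged log periodogram across trials."""
    rng = random.Random(seed)
    est = [mean_log_periodogram(level, n_segments, n_samples, k, rng)
           for _ in range(n_trials)]
    return statistics.fmean(est), statistics.pstdev(est)
```

Calling `scatter(30.0)` and `scatter(3.0)` should give nearly the same
standard deviation, about pi / sqrt(6 * 16) = 0.32 in this model, while the
two means differ by about ln(10). In this model the log average also sits
below the log of the true level by Euler's constant (about 0.58), a bias
that would have to be corrected if absolute levels matter.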


Remarks 

This  analysis  has  described  what  the  FFT  (and  periodogram)  does  in 
power  spectrum  estimation  and  how  it  is  affected  by  noisy  input  signals. 
Using  this  knowledge,  a  simple  signal  model  was  constructed  which  qualita¬ 
tively  duplicates  the  experimental  results  at  a  single  frequency  (4  Hz). 
Alternate  signal  processing  schemes  which  reduce  the  observed  variability 
were  then  discussed. 

These  alternate  techniques  estimate  different  parameters  than  those  of 
the  current  method,  and  the  usefulness  of  any  of  these  alternates  can  only 
be judged by considering the relevance of the estimated parameter as well as
the  accuracy  achieved  in  estimating  it.  Thus,  if  the  classic  method 
(averaging  periodograms)  reduces  variability  (by  better  estimating  the 
signal  plus  noise  power)  but  does  not  help  distinguish  between  two  low-level 
signals  of  nearly  the  same  power,  it  may  not  be  useful  in  quantifying  visual 
performance.  Alternatively,  by  estimating  all  of  the  power  near  4  Hz,  the 
classic  approach  may  find  a  signal  component  that  was  filtered  out  (e.g., 
due  to  phase  jitter  in  the  evoked  response)  in  the  current  method,  thereby 
reducing  variability  and  improving  sensitivity. 

To determine what signals are really present, and what the best ways to
estimate  them  are,  a  thorough  analysis  of  the  raw  data  (measured  EEGs  from 
many  electrodes,  as  currently  recorded)  is  necessary.  Some  of  the  tests 
that  we  recommend  performing  on  this  data  are  listed  below. 

First,  the  classic  spectral  estimation  technique  should  be  used  to 
track  the  propagation  of  the  full  spectrum  with  time.  (Other  analysis  tools, 
in addition to spectral techniques, may be needed if significant
nonstationarities are noticed.) The changes in the spectrum between stimulus on and
off  should  be  noted.  The  response  peaks  (to  the  stimulus)  should  be 
examined  to  determine  if  they  are  broad  enough  to  be  measured  by  the  classic 
technique  but  not  by  the  current  approach. 

Examining  the  full  spectrum  will  also  determine  whether  any  extraneous 
large  peaks  are  present  to  corrupt  the  data  and  result  in  poor  FFT  scaling. 
Fixed-point  FFT  routines  usually  scale  the  signal  to  reduce  error,  which 
tends to be of a constant magnitude independent of the signal (i.e., if the
signal weren't scaled, the maximum available signal-to-processing-noise
ratio would not be obtained). A large peak away from 4 Hz might therefore
govern  the  scaling  operation,  leaving  the  4  Hz  component  with  more  noise 
than  necessary.  This  may  be  cured  by  simply  filtering  the  EEG  around  4  Hz. 
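
The scaling effect can be mimicked with a simple word-length model. The
Python sketch below is an assumption-laden illustration, not a model of any
particular FFT routine: rounding is applied once to the input after scaling
to full range, the interfering peak is placed at 10 Hz, and filtering is
idealized as complete removal of that peak (interferer amplitude zero).

```python
import cmath
import math
import random


def quantize(x, bits):
    """Scale x to full range, round to signed integers of the given width, rescale back."""
    peak = max(abs(v) for v in x)
    levels = 2 ** (bits - 1) - 1
    return [round(v / peak * levels) * peak / levels for v in x]


def coefficient(x, freq_hz, fs_hz):
    """Complex Fourier coefficient of x at freq_hz."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    return sum(v * cmath.exp(-1j * w * i) for i, v in enumerate(x)) / len(x)


def rms_rounding_error(interferer_amp, bits=8, fs_hz=64.0, n=256, n_trials=50, seed=0):
    """RMS error of the 4 Hz coefficient caused by rounding, for a given 10 Hz peak."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        x = [math.sin(2 * math.pi * 4.0 * i / fs_hz)            # 4 Hz response
             + interferer_amp * math.sin(2 * math.pi * 10.0 * i / fs_hz + phase)
             + rng.gauss(0.0, 1.0)                              # background activity
             for i in range(n)]
        err = coefficient(quantize(x, bits), 4.0, fs_hz) - coefficient(x, 4.0, fs_hz)
        total += abs(err) ** 2
    return math.sqrt(total / n_trials)
```

Comparing `rms_rounding_error(50.0)` with `rms_rounding_error(0.0)` shows
the rounding error at 4 Hz growing roughly in proportion to the peak that
sets the scale, which is the motivation for filtering around 4 Hz first.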

Also,  we  believe  it  is  important  to  examine  all  the  recorded  electrodes 
for  useful  information.  One  channel  will  undoubtedly  have  most  of  the 
response  power,  but  the  other  channels  may  be  useful  in  obtaining  an 
accurate  EEG  (background)  signal  which  could  be  used  to  improve  the  VER 
resolution (e.g., by subtracting the background from the VER-plus-background
channel  before  regular  processing). 
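
One simple way to carry out such a subtraction is a least-squares scaling
of a reference channel. The Python sketch below is hypothetical: the gain
estimate, the synthetic channels, and the 0.8 coupling factor are all
assumptions for illustration, and it presumes the reference channel carries
little of the VER itself (otherwise part of the response is subtracted
along with the background).

```python
import math
import random


def subtract_background(primary, reference):
    """Subtract the least-squares scaled reference channel from the primary channel."""
    gain = (sum(p * r for p, r in zip(primary, reference))
            / sum(r * r for r in reference))
    return [p - gain * r for p, r in zip(primary, reference)], gain


def demo(n=2048, fs_hz=64.0, seed=0):
    """Return (error before, error after, estimated gain) for synthetic channels."""
    rng = random.Random(seed)
    ver = [math.sin(2 * math.pi * 4.0 * i / fs_hz) for i in range(n)]  # hypothetical 4 Hz response
    background = [rng.gauss(0.0, 5.0) for _ in range(n)]               # shared spontaneous EEG
    primary = [v + 0.8 * b + rng.gauss(0.0, 0.1) for v, b in zip(ver, background)]
    reference = [b + rng.gauss(0.0, 0.5) for b in background]          # channel with little VER
    cleaned, gain = subtract_background(primary, reference)

    def mse(x):
        # Mean squared deviation from the true response.
        return sum((a - v) ** 2 for a, v in zip(x, ver)) / n

    return mse(primary), mse(cleaned), gain
```

In this synthetic case the mean squared error relative to the true response
drops by a large factor after subtraction. With real recordings the benefit
depends on how well the background is shared between electrodes.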

These  tests,  and  others  that  may  appear  appropriate  after  an  initial 
investigation,  can  all  be  performed  on  recorded  data,  and  do  not  require 
special  new  experiments. 


Conclusions 

On  the  basis  of  analysis  of  the  available  data,  we  can  conclude  that 
the  observed  periodogram  variability  is  consistent  with  FFT  processing  of  a 
simple  noisy  measurement  model.  Therefore,  it  is  entirely  plausible  that 
the  observed  variability  is  due  to  two  principal  factors:  (1)  a  high  "noise" 
level  in  the  signal--most  probably  the  spontaneous  EEG,  and  (2)  the  extreme 
sensitivity  of  a  single  periodogram  to  the  residual  noise  present  after 
time-series  averaging.  In  particular,  the  observed  increased  variability 
when  the  stimulus  is  present  is  probably  caused  by  the  nonlinearity  in  the 
processing  technique  (squaring  signal  and  noise)  rather  than  by  an  increase 
in the noise level.

We  also  believe  that  the  variability  in  the  processed  signal  may  be 
reduced  (i.e.,  the  effect  of  the  noise  on  a  visual  performance  measure  may 
be  reduced)  by  alternate  processing  techniques.  Two  of  the  techniques 
discussed were: a simple modification to the current approach--altering the
order of FFT and averaging operations--and a simple linear filter to
estimate the VER amplitude. Selection of the most useful processing
technique (or simply optimal tuning of the available techniques) must,
however,  await  an  analysis  of  the  original  measured  data.  Only  then  can  an 
appropriate  signal  model  and  related  processing  scheme  be  chosen. 

A thorough analysis of the raw data may reveal that our simple model is
valid but that the signal-to-noise ratio ("signal" and "noise" as defined
earlier) is simply too low to permit sufficient reduction of variability for
the U.S. Air Force purposes. In this case, new information (e.g., through
more electrodes) or revised experimental procedure may become necessary.

The  data  analysis  may  also  reveal  that  a  more  complex  signal  model,  and 
sophisticated  processing  techniques,  are  required.  In  this  event,  improved 
performance  may  be  obtained  at  the  cost  of  some  additional  processing 
complexity.  These  issues  are  addressed  in  the  next  two  sections. 


ASPECTS  OF  EEG/VER  VARIABILITY 

In  our  Interim  Report  we  developed  the  concept  of  an  Eye-Brain- 
Electrode  model  (see  also  Appendix  B).  From  information  we  obtained  to  date 
from  the  U.S.  Air  Force,  it  appears  to  us  that  a  considerable  effort  is  made 
by  the  U.S.  Air  Force  to  control  for  variability  due  to  the  experimental 
conditions  and  eye  (especially  for  the  monocular  preparation).  Thus,  we 
will  mainly  address  the  variability  due  to  the  brain,  discuss  some  appar¬ 
ently  new  aspects  of  the  brain-electrode  transmission,  and  aspects  of 
improving  the  utilization  of  electrodes. 

Roughly, the concept of variability of EEG and VER arose from experiments
in which many factors are unknown and their combined effect on
measurements  appears  random.  But  even  when  many  experimental  variables  are 
controlled, successive measurements may differ in quality or in quantity.
One attempts, of course, to control for as many factors as possible, but the
number  of  potential  factors  in  living  systems  is  rather  overwhelming.  But 
even  when  the  effect  of  some  of  the  important  factors  is  known  individually 
for  each  factor,  one  can  often  not  predict  the  effect  of  several  simulta¬ 
neously  acting  factors.  All  of  the  resulting  changes  in  experimental 
outcomes  may  loosely  be  regarded  as  experimental  variability. 

Often  a  quantitative  stochastic  point  of  view  is  adopted  in  order  to 
describe  variability.  This  concept  lends  itself  to  a  further  subdivision  of 
variability  into  variability  due  to  sampling  fluctuation  (variability  at 
fixed  experimental  conditions)  and  changes  that  are  of  a  systematic  nature 
such  as  adaptation  or  fatigue. 

When  quantifying  variability,  one  should  always  be  aware  that  the  term 
is  relative  and  the  significance  of  a  particular  form  of  variability  only 
attains  relevance  when  predictions  are  formed.  For  example,  variability  of 
scalp  potentials  can  be  used  to  assess  lateralization  of  the  brain  (Rebert 
and Low (84); Pfurtscheller et al. (81); Beaumont et al. (10)). In what
follows  we  will  not  stress  any  particular  measure  of  variability,  but 
indicate  in  which  sense  a  particular  investigator  perceived  variability  to 
be  important. 

In  this  section,  we  will  first  discuss  the  variability  of  EEG  and  VER 
under presumably fixed experimental conditions. Subsequently we will
briefly discuss some anatomical and neurophysiological principles which have
to  be  considered  for  the  derivation  of  "good"  measures  (estimators)  of  VER. 


General  Considerations  for  Assessing  Variability  of  Responses 

For  the  purpose  of  quantifying  responses  as  they  are  expressed  in  the 
EEG,  four  methodologies  stand  out: 

1.  Signal  transfer  (modeling  characterization) 

2.  Steady-state  responses  (these  often  lead  to  signal  transfer 
modeling) 

3.  (Single)  response  waveform  analysis  in  space  or  in  time  (often  one 
uses  random  stimulus  intervals  and  averages  waveform) 

4.  The  use  of  "a  measure"  of  EEG  activity  such  as  power  spectral 
density  estimation. 

The  first  approach  is  a  typical  engineering  approach  and  is  geared  for 
the  prediction  of  an  output  (the  response)  given  some  input  (the  stimulus). 
In  some  instances  the  converse  statement  is  also  true,  namely  that  the  input 
can  be  estimated  from  the  output.  In  principle  such  an  analysis  might 
appear  to  be  the  most  attractive  one.  For  practical  matters,  however,  the 
method  is  only  useful  when  fairly  simple  structures  are  analyzed  or  the  set 
of  possible  inputs  (stimuli)  is  small.  The  high  complexity  of  living 
structures  can  set  limits  to  the  identifiability  (e.g.,  too  many  variable 
components)  of  a  given  structure  when  only  a  "black  box"  approach  is  taken. 
Also,  in  order  to  construct  a  quantitative  model  for  a  particular  input- 
output  relation,  much  data  has  usually  to  be  analyzed.  For  work  in  the 
direction  of  signal  transfer  modeling,  see  Desmedt  (28). 

The  second  approach,  steady-state  VER  analysis,  is  in  many  cases  an 
investigative  stage  prior  to  the  above  transfer-modeling  approach.  In 
steady-state  VER  analysis,  a  stimulus  is  applied  periodically  and  the 
(usually)  resulting  periodic  response  is  extracted.  Typically  amplitude  and 
phase  relation  (related  to  latency)  to  the  stimulus  are  studied  at  the 
fundamental  frequency  (e.g.,  reversal  rate)  and  its  harmonics.  When 
hardware  correlation  filters  are  used  for  this  purpose  they  may  have 
adjustable  bandwidth  from  about  1  Hz  down  to  .001  Hz  (Regan  (85,86)).  The 
narrow  bandwidth  is  equivalent  to  taking  a  long  averaging  window  and 
typically  reduces  "noise"  or  variability,  but  results  on  the  other  hand  in 
slow  tracking  of  possible  true  changes  of  a  response.  This  method  of 
analysis  is  used  as  a  diagnostic  tool  in  medical  practice. 
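
The amplitude and phase extraction described above can be sketched in
software as a synchronous (correlation) detector. The Python below is an
illustration only, not a description of the hardware filters cited. The
synthetic response, its amplitudes and phases, the noise level, and the
sampling rate are assumptions, and the record is chosen to contain a whole
number of stimulus cycles.

```python
import math
import random


def harmonic_amplitude_phase(x, fs_hz, f0_hz, harmonic):
    """Amplitude and phase of the component a*cos(2*pi*harmonic*f0*t + phase) in x."""
    n = len(x)
    w = 2.0 * math.pi * harmonic * f0_hz / fs_hz
    c = 2.0 / n * sum(v * math.cos(w * i) for i, v in enumerate(x))
    s = 2.0 / n * sum(v * math.sin(w * i) for i, v in enumerate(x))
    return math.hypot(c, s), math.atan2(-s, c)


def steady_state_record(t_sec, fs_hz=64.0, f0_hz=4.0, noise_std=2.0, seed=0):
    """Hypothetical response: fundamental plus second harmonic in white noise."""
    rng = random.Random(seed)
    n = int(t_sec * fs_hz)
    return [2.0 * math.cos(2 * math.pi * f0_hz * i / fs_hz - 0.6)
            + 0.5 * math.cos(2 * math.pi * 2 * f0_hz * i / fs_hz + 1.0)
            + rng.gauss(0.0, noise_std)
            for i in range(n)]


if __name__ == "__main__":
    for t_sec in (4.0, 64.0):  # the analysis bandwidth is roughly 1 / t_sec
        x = steady_state_record(t_sec)
        for h in (1, 2):
            amp, ph = harmonic_amplitude_phase(x, 64.0, 4.0, h)
            print("T = %2.0f s, harmonic %d: amplitude %.2f, phase %.2f rad"
                  % (t_sec, h, amp, ph))
```

The averaging time T plays the role of the filter bandwidth: a bandwidth of
1 Hz corresponds to roughly 1 sec of averaging, and .001 Hz to roughly
1000 sec. The longer record gives steadier estimates but cannot follow
changes faster than T.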

Single  response  waveform  analysis  is  often  done  by  averaging  individual 
responses,  all  separated  by  large  time  intervals.  Stimulation  is  done 
either  in  a  periodic  fashion,  or  preferentially  with  random  time  intervals, 
between  successive  stimuli.  Using  randomized  stimulus  times  is  conceptually 
comparable  to  drawing  randomized  samples,  a  typical  approach  in  statistical 
analysis.  Such  an  approach  is  aimed  at  reducing  variability  (or  cost)  of 
decisions  based  on  the  sample  (such  as  decision  regarding  the  visual 
performance).
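
One practical consequence of randomized stimulus times can be shown with a
synthetic record. The Python sketch below is hypothetical throughout: the
evoked waveform, the 10 Hz background rhythm, the noise level, and the 3 to
6 sec interval range are assumptions for illustration. It compares averages
formed with random and with strictly periodic stimulus times.

```python
import math
import random


def template(epoch_len):
    """Hypothetical evoked waveform: a damped single-cycle sinusoid."""
    return [4.0 * math.sin(2 * math.pi * i / epoch_len) * math.exp(-i / 15.0)
            for i in range(epoch_len)]


def averaged_response(onsets, epoch_len=30, fs_hz=100.0, seed=0):
    """Build a record with one response per onset, then average the epochs."""
    rng = random.Random(seed)
    n = onsets[-1] + epoch_len
    phase = rng.uniform(0.0, 2.0 * math.pi)
    # Background: a 10 Hz rhythm plus white noise (both assumed for illustration).
    record = [5.0 * math.sin(2 * math.pi * 10.0 * i / fs_hz + phase)
              + rng.gauss(0.0, 5.0) for i in range(n)]
    tmpl = template(epoch_len)
    for o in onsets:
        for i, v in enumerate(tmpl):
            record[o + i] += v
    epochs = [record[o:o + epoch_len] for o in onsets]
    return [sum(col) / len(epochs) for col in zip(*epochs)]


def rms_error(avg):
    """RMS difference between an averaged response and the true waveform."""
    tmpl = template(len(avg))
    return math.sqrt(sum((a - t) ** 2 for a, t in zip(avg, tmpl)) / len(avg))


def random_onsets(n_stimuli, lo=300, hi=600, seed=1):
    """Stimulus sample indices with random intervals (3 to 6 sec at 100 Hz)."""
    rng = random.Random(seed)
    onsets, t = [], 0
    for _ in range(n_stimuli):
        t += rng.randint(lo, hi)
        onsets.append(t)
    return onsets


def periodic_onsets(n_stimuli, period=400):
    """Stimulus sample indices with a fixed 4 sec interval."""
    return [period * (j + 1) for j in range(n_stimuli)]
```

With 100 stimuli, `rms_error(averaged_response(random_onsets(100)))` should
be several times smaller than the value obtained with
`periodic_onsets(100)`. In the periodic case the assumed rhythm happens to
be phase locked to the stimulus and survives the averaging.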

One may categorize a fourth approach in the evaluation of the response
of  the  EEG  to  a  stimulus.  This  approach  is  characterized  by  the  use  of  "a 
measure"  of  EEG  activity.  The  change  of  the  measure  in  response  to  a 
stimulus  is  studied.  The  choice  of  a  particular  measure  reflects  the 
experience  of  the  investigator  or,  in  some  instances,  the  properties  of  an 
algorithm  (Sencaj  et  al.  (101)).  Frequently  used  measures  are  the  power 
spectral  density  at  specific  frequencies  or  energy  within  a  frequency  band 
and  the  autocorrelation  (or  cross-correlation)  at  selected  lags.  When 
spatial  distribution  of  EEG  activity  is  measured,  spatial  spectra  and 
correlation  are  used  (Adey  and  Walter  (3),  Nunez  (74)).  The  use  of  such 
measures  must  usually  be  regarded  as  somewhat  nonspecific.  Lack  of 
well-defined  objectives,  insufficient  understanding  of  underlying  mechanisms 
but well-understood properties of above-mentioned (noncommittal) measures
explain  their  preferential  use. 

In  what  follows  we  will  consider  evidence  for  the  variability  of  any  of 
the  above-mentioned  measures  as  necessary.  The  discussion  of  several 
measures  relating  to  variability  results  directly  from  the  U.S.  Air  Force 
objective  to  reduce  variability  in  their  measures:  the  goal  to  derive  such 
measures  (with  respect  to  particular  U.S.  Air  Force  objectives)  can  only  be 
accomplished  when  the  properties  and  contributions  of  several  ongoing 
physiological  and  physical  mechanisms  are  clarified. 


Variability  of  the  VER 

In  the  last  decade  the  variability  of  the  VER  has  become  of  increasing 
concern,  especially  in  the  context  of  establishing  confidence  in  averaged 
VERs. Physiological variability, as opposed to sampling fluctuation, is
often described in more specific terms such as adaptation, conditioning
effects, habituation and dishabituation, sustained and transient responses,
and response plasticity. To understand this physiological variability of
the VERs, it was found important to investigate responses in relation to
other ongoing brain activity such as various rhythms (α, β, γ, δ, θ), the
Bereitschaftspotential, and the P-300 wave. For some new findings in this
area  using  sophisticated  signal  analysis,  see  Chapman  et  al.  (21).  For 
improved  understanding  of  the  variability  of  responses,  invasive  micro¬ 
electrode  studies  have  become  quite  prevalent.  In  contrast  to  scalp 
potentials  they  provide  drastically  increased  spatial  and  frequency 
resolution  of  electric  nervous  activity. 

One  of  the  early  systematic  studies  of  variability  in  the  VER  is  found 
in Ciganek (25) who used O-P bipolar responses to flashes (.3 Joule, eyes
closed) with random intervals (3-6 sec). Ciganek describes considerable
intersubject  differences  analyzing  mean  amplitude  (at  peaks  of  response  wave 
some  10  msec  after  stimulus)  and  standard  deviation.  The  ratios  of  standard 
deviation  over  average  peak  amplitude  lie  between  .24,  for  the  "best" 
subject,  up  to  about  10,  for  the  "worst"  subject.  In  that  latter  subject 
not  only  the  mean  amplitude  decreased,  but  more  importantly  the  standard 
deviation  was  increased  by  a  factor  of  8  compared  to  the  "best"  subject. 

Very  interestingly,  Ciganek  reports  also  for  some  subjects  a  pronounced 
decrease  of  VER  variability  (standard  deviation  of  potential)  some  80  msec 
after  the  flash  stimulus.  Apparently  this  decrease  in  variability  arises 
from  a  general  response  of  the  brain  to  the  stimulus.  The  finding  of 
modification of brain activity by stimuli is also in agreement with the
investigations  by  Lansing  and  Barlow  (61). 

The relation between VER, adaptation, attention, fatigue, etc., has been
studied quite extensively with invasive microelectrode techniques or
microelectrodes attached to the scalp (Riggs and Wooten (90), Van de Grind
et al. (112), Moise and Costin (71), Gould et al. (41), Salamy and McKean
(96), Oatman and Anderson (75), Pay (79), Schafer (97), Halgren et al. (48),
Rohrbaugh et al. (92), Kitajima (56), Kulikowski et al. (57), Drozdenko (33),
Hennessy and Levine (51), Grunewald et al. (43)). There is general
agreement on the importance of the limbic system (related to emotion and
autonomic control), and hence in microelectrode studies recording sites often
include the hypothalamus and hippocampal area in order to obtain indicators
for the arousal state of the animal. The typical measurements to derive
these indicators utilize transmembrane potentials (slow potential variations)
and neural firing rates of single cells. By these techniques the effect of
alertness or drowsiness of the test animal on the transfer characteristics
of the lateral geniculate bodies (the first relay stations of the optic
nerve, layered structures where binocular interactions first take place)
has been shown.

On an anatomical level (Hubel and Wiesel (52)) back projections of
fibers from cortical layers to the lateral geniculate bodies have been shown.
The complexity of this structure, lateral geniculate bodies and visual
cortex, is further underlined by evidence for inputs from structures other
than the lateral geniculate bodies (inferior and lateral pulvinar, Rezak and
Benevento (39)) to the primary visual cortex (Brodmann's area 17). The
existence  of  these  connections  underlines  the  capability  of  VERs  to  produce 
a  rich  set  of  responses  under  seemingly  identical  experimental  conditions. 
These  findings  also  suggest  not  viewing  the  visual  cortex  as  an  "isolated 
unit"  when  trying  to  model  certain  aspects  of  it.  Instead,  activity  in 
other  areas  may  be  important  in  explaining  activity  of  area  17. 

Along  these  considerations  an  interesting  aspect  of  VER  variability  is 
the apparent influence of the phase relations between α-waves and stimulus
on the VER. Work by Dustman and Beck (34), an extension of earlier
psychophysical results, aids in understanding subjective brightness
enhancement when flash stimuli are phase locked with α-waves. The
importance of considering various components of the EEG is underlined by this
finding. 

The  modulation  transfer  function  (MTF)  as  a  function  of  space  and  time 
has been studied by Van de Grind et al. (112), and they give their results
in terms of isomodulation lines. However, their finding should be taken
with some care since Harter and Previc (50), investigating variability of
the  MTF  quite  rigorously,  found  important  adaptive  processes  in  this 
transfer  system.  In  essence  they  studied  the  susceptibility  of  the  spatial 
MTF  to  changing  attention  to  the  stimulus  and  expectation  of  the  individual. 
By  analysis  of  response  amplitudes  at  a  specific  latency,  they  were  able  to 
show  an  actual  tuning  of  specific  frequency  channels.  For  evaluating  the 
MTF the experimental sequence of stimuli is thus important. Neglect of
this  effect  would  clearly  increase  the  unexplained  variability  of  any 
results. 

Several investigators have been concerned with the effect of simultaneous
stimulation from different sensory modalities. The frequent finding
of strong interactions of evoked responses to these paired stimuli (possibly
with some interstimulus time lag) (Oatman and Anderson (75), Fox (37),
Stowell (106)) suggests the control of at least some of them, auditory
stimuli in particular (Desmedt (29)).

Discussing variability of measures of VER also invites the question of
modifications of measures which, in a particular context, improve
performance. One possible augmentation of measures is the P300 wave, shown
to be a sensitive and significant indicator of attentiveness (Drozdenko (33)).
However  it  appears  that  the  use  of  the  P300  wave  limits  stimulus  rates  to 
frequencies  below  3  Hz. 

Modifications  of  measures  of  VER  which  do  not  show  this  limitation  are 
very  desirable.  Two  findings  concerning  latency  and  spectral  properties 
might be important for the U.S. Air Force. De Voe et al. (30) showed that
predictions of stimulus luminance based on latency perform markedly better
than those based on amplitude. Somewhat in contrast to this finding might
appear the recent data by Osaka and Yamamoto (77) who show very high
correlation between amplitude and latency when luminance is changed. Their
particular stimulus condition, a 1° stimulus source, presumably very precise
orientation of the eyeball, a well-motivated subject, and the analysis of
very early response waves (P1) might in part account for their result.
Regarding  day-to-day  variations  they  find  reaction  time  and  response 
amplitude  to  change  less  than  0.5  log  units. 

The  second  finding  concerning  different  spectral  components  is 
discussed  in  Sokol  (103).  He  summarizes  that  "low  and  medium  frequency 
range reflect poorly the psychophysically determined spectral sensitivity
functions, while the high frequency components show good agreement with
photopic  spectral  sensitivity."  These  findings  suggest  the  expansion  of  the 
measures  derived  from  the  scalp  potentials.  Specifically,  high  frequency 
components  of  the  EEG  and  the  phase  relation  (a  relative  of  latency)  of  the 
VER  to  the  stimulus  should  then  be  included  in  the  analysis. 


Spatial  Properties  of  the  VER 

From  basic  neuroanatomic  and  physical  considerations  one  expects  to 
find  local  electric  activity  of  the  scalp  to  correlate  with  stimulus 
modality. Clearly, the combined consideration of spatial and time
properties of the VER leads to some experimental and data acquisition
difficulties and new aspects for data analysis. On the experimental side the
main problem is reliable reproduction of multielectrode arrays, and on the
data acquisition side, high analog-to-digital conversion rates. For careful
mapping of the entire scalp potentials Ragot and Remond (83) recommend about
200 electrodes (human), spaced about 2 cm apart. Currently, however,
development of potential maps is usually limited to tens of electrodes and
hence some spatial smoothing has to be performed in order to arrive at such
maps. Some of these maps are presented and discussed in Creutzfeldt and
Kuhnt (27), Allison et al. (6), and Goff et al. (40). These maps might
provide guidance for selection of multiple electrode placement. In addition,
timing windows for data acquisition may be specified in order to reduce the
analog-to-digital conversion rate requirements.


The  EEG  of  Restrained  Monkeys 

In  many  VER  experiments  the  experimental  animal  is  restrained  in  its 
body  movements.  Studies  concerning  the  effect  of  restraint  on  EEG  were 
conducted  especially  in  the  case  of  monkeys.  It  was  found  that  with 
progressively  more  restraining  conditions  (head,  legs,  arms),  the  power 
spectral  density  of  the  EEG  shows  corresponding  progressive  changes  (Bouyer 
et al. (13), Rougeul et al. (93)). In these studies the EEG of the
experimental animal in the unrestrained condition (in the cage) was compared
to the condition when the animal was strapped down in some device. Bouyer et al.
(13)  suggest  the  use  of  an  anxiolytic  drug  (diazepam)  in  order  to  restore 
the  highly  abnormal  EEG  to  near  normal. 


Anatomical  and  Neurophysiological  Considerations  of  VER  Changes 

For  studies  of  visual  evoked  responses  one  usually  observes  the 
electric  potential  near  the  occiput.  The  anatomical  basis  for  the  use  of 
this  electrode  location  is  the  proximity  of  the  underlying  visual  cortex 
(area  17,  18,  19).  Consequently,  one  finds  relatively  large  electric  VER 
potentials on the occiput when compared to other locations. It is necessary
to  obtain  large  potentials  because  of  masking  noise-like  "background 
activity"  of  the  brain. 

From  various  field  mapping  techniques  and  neuroanatomic  investigations 
(Szentagothai (107, 108), Brooks and Jung (15), Sokol (103), Hubel and
Wiesel  (52))  it  is  found  that  the  visual  world  of  upper  half,  lower  half, 
left  half,  right  half,  and  a  macular  area  have  to  be  distinguished.  These 
areas  map  into  corresponding  areas  in  the  visual  cortex.  Interestingly  the 
macular  area  has  a  very  large  representation  in  terms  of  area  on  the  cortex 
when  compared  to  more  peripheral  regions.  Sokol  (103,  p.  25)  gives  as  a 
typical  value  for  this  representation  near  the  fovea  that  2  min  of  arc  in 
the  visual  world  correspond  to  1  mm  in  the  visual  cortex. 

Studies  by  Harter  (49),  in  qualitative  agreement  with  Riggs  and  Wooten 
(90, p. 715), show that the central 3° are mainly responsible for VER
potentials on the scalp. Data by Sokol (103, Fig. 14) show the small
contribution of stimuli outside that central 3° range. From these
experiments a relatively large response can be expected from low-diameter
stimulus fields (compare Osaka and Yamamoto (77)) provided they are centered
foveally.
However,  slight  displacement  (a  few  minutes  of  arc)  may  cause  changes  of 
observed  potentials,  because  the  electrode  location  does  not  follow  the 
corresponding  response  location  of  cortical  electric  sources.  An  additional 
aspect  arises  when  opposing  dipole  moments  are  nearly  cancelling  (possibly 
left  versus  right  hemisphere).  In  such  a  case  a  slight  spatial  shift  of  the 
stimulus  may  either  result  in  zero  potential,  or  amplitude  reversal.  That 
such  effects  deserve  attention  follows  from  the  distinct  properties  of  the 
response components C1 (~75 msec delay), C2 (~100 msec delay), and C3
(~150 msec delay) described by Jeffreys (in Desmedt (28), p. 136). He shows
that  inversion  of  patterns  produces  corresponding  inversion  for  some  of  the 
potentials.  Hence,  for  at  least  certain  patterns,  cancellation  of  differen¬ 
tial  voltages  may  result.  In  principle  such  an  effect  may  contribute  to  the 
finding  of  some  researchers  (Sokol  (103))  of  rather  variable  amplitude  but 
fixed  response  times  at  a  given  luminance.  This  concept  might  also  be 
related  to  the  finding  of  "variable"  narrow-band  low-frequency  components 
compared  to  "less  variable"  high-frequency  components.  (High-frequency 
components may never attain a very fixed phase relation with the stimulus
and  hence  cancellation  is  not  so  obvious.)  Anatomical  considerations  about 
the  mirror  image-like  mappings  between  areas  17,  18,  and  19  suggest  that 
these  contributions  are  separable  by  selecting  appropriate  electrode 
locations.  In  summary,  analysis  of  scalp  potentials  by  space,  time,  and 
frequency  properties  is  important  for  the  derivation  of  good  measures  of 
visual  performance. 


Signal  Transmission  from  Cortex  to  Scalp 

It  has  puzzled  neuroscientists  now  for  some  time  (F.G.  Worden,  1979, 
Director,  Neurosciences  Research  Program,  M.I.T.,  personal  communication; 
Pfurtscheller  and  Cooper  (80))  that  despite  the  large  high-frequency  content 
of  intracortical  recordings,  high  frequencies  on  the  scalp  are  very  small 
and  are  buried  in  electronic  noise.  Pfurtscheller  and  Cooper  inserted 
microelectrodes  into  cortical  regions,  passing  high-frequency  currents  through 
the  tissue.  No  particularly  selective  suppression  of  high-frequency  components, 
as  recorded  on  the  scalp,  was  observed;  they  call  for  an  explanation  other 
than tissue properties to be responsible for the surprisingly weak
high-frequency components on the scalp as they arise from natural cortical
activity. 

To  get  a  grip  on  this  phenomenon,  it  appeared  worthwhile  to  us  to 
review  properties  of  other  bioelectric  potentials,  especially  those  which 
arise  from  a  large  number  of  similar  cells.  The  muscle  as  a  source  of 
bioelectric  potentials  comes  to  mind.  There  too,  one  observes  rather  weak 
high-frequency  components  on  the  superficial  skin,  while  internal  activity 
contains  strong  high-frequency  components.  Some  attempts  have  been  made  to 
account  for  this  phenomenon  (including  false  arguments  about  wave-guide 
phenomena),  but  the  most  successful  and  accurate  work  was  done  by  Lindstrbm 
and  Magnusson  (64).  His  theory  predicts  for  fibers,  conducting  action 
potentials  with  a  given  velocity,  a  power  spectral  density  on  the  skin 
(decreasing  with  increasing  frequency),  which  is  in  good  agreement  with 
experiments.  Without  going  into  detail  of  his  mathematical  derivation,  we 
just  point  out  that  the  phenomenon  of  small  high-frequency  components  is 
basically  due  to  the  travelling  of  action  potentials.  In  Appendix  L 
we give a simple outline of the concept which was solved by Lindström and
Magnusson  (64)  for  special  geometries.  The  result  suggests  the  possibility 
of  tuning  sensor  electrodes  to  nearby  sources  by  selecting  high-frequency 
components.  To  exploit  a  range  of  frequencies  rather  than  a  single 
frequency,  the  frequency-dependent  properties  of  electrodes  become  important 
and motivate the subsequent section.


Frequency-Dependent Properties of Macroelectrodes (and Amplifiers)


A  good  introductory  treatment  of  the  transfer  characteristics  of 
electrodes  is  given  in  Cobbold  (26)  and  the  relevant  main  points  are 
reviewed  in  Appendix  B.  In  summary,  we  recall  that  the  typical  impedance  of 
these electrodes at low frequencies is around 10 kΩ (compare also Osaka and
Yamamoto (77)). Interestingly, however, as higher frequencies are used, the
impedance falls considerably. To assess impedances of electrodes empirically,
electrode-array approaches similar to that of Robillard and Poussart (91)
can be used.

It  should  be  noted  that  the  frequency-dependent  transfer  function 
characterization  is  insufficient  to  understand  the  limitations  of  recording. 
The  frequency-dependent  noise  characteristics  set  the  ultimate  performance 
limit. With respect to these noise characteristics the work by Van der Ziel
(113)  is  fundamental.  He  distinguishes  several  different  mechanisms  for 
noise,  the  most  important  ones  in  practice  with  electronic  components 
(including  electrodes)  being  the  1/f-noise  (mainly  due  to  quantum  mechanical 
effects  of  tunneling),  burst  noise  (with  an  insufficiently  understood 
mechanism  of  origin),  and  Schottky-noise.  The  1/f-noise,  or  flicker  noise, 
has  a  power  spectral  density  which  falls  with  frequency  f  like  1/f  and  is  of 
importance  for  very  low-frequency  noise  (including  drift)  up  to  frequencies 
of  a  few  hundred  Hertz.  The  burst  noise  has  similarly  a  moderately 
low-frequency  power  spectral  density,  while  the  Schottky-noise  behaves  like 
white  noise  from  DC  up  to  terahertz. 
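
The interplay of the flicker and white components can be illustrated with a simple combined spectral model. The sketch below is an illustration only and is not taken from Van der Ziel (113): it assumes a one-sided power spectral density made of a flat (white) floor plus a 1/f term, parameterized by a hypothetical corner frequency at which the two contributions are equal. The function names and all numerical values are assumptions.

```python
import math


def noise_psd(f_hz, white_level, corner_hz):
    """One-sided noise PSD: flat white floor plus a 1/f (flicker) term.

    white_level -- PSD of the white component (assumed value, e.g. V^2/Hz)
    corner_hz   -- assumed frequency at which flicker and white PSDs are equal
    """
    if f_hz <= 0.0:
        raise ValueError("frequency must be positive; the 1/f term diverges at DC")
    return white_level * (1.0 + corner_hz / f_hz)


def band_noise_power(f_lo, f_hi, white_level, corner_hz):
    """Noise power in [f_lo, f_hi], from integrating noise_psd analytically."""
    if not 0.0 < f_lo < f_hi:
        raise ValueError("need 0 < f_lo < f_hi")
    return white_level * ((f_hi - f_lo) + corner_hz * math.log(f_hi / f_lo))
```

Under these assumptions, a band of fixed width collects much more noise power when it lies below the corner frequency than when it lies above it, which is one way to see why low-frequency and high-frequency recording face different limits.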

The  characteristics  of  the  noise  suggest  different  limitations  for  the 
recording  of  low  versus  high  frequencies.  Van  der  Ziel  (113)  emphasizes  the 
importance of matching amplifiers to the frequency band of interest
(possibly using different amplifier units for different frequencies) and
using  amplifiers  with  certain  technologies  (e.g.,  use,  in  some  cases,  input 
pnp-transistors  rather  than  npn  transistors  or  FETs  and  use  certain 
semiconductor  cleavage  planes).  He  also  discusses  a  variety  of  aspects  in 
connection  with  the  design  of  amplifier  input  stages.  It  appears  to  us  that 
many  investigators  (personal  communications)  are  not  aware  of  some  of  these 
fundamental principles. The view currently held (Cobbold (26)) is to use a
high input impedance in order to ensure good common-mode rejection.
However, in connection with sophisticated signal processing this need has
not yet been demonstrated.


Summary 

In summary, it is seen that there is considerable evidence for
apparent VER variability with an origin other than just signal analysis.
The origin of this variability is mainly neurophysiological. Ways to
improve  prediction  or  classification  in  the  presence  of  variability  must 
come  from  improved  extraction  of  information  from  the  scalp  potential  field. 
This  can  be  accomplished  with  improved  techniques  for  VER  analysis  (as 
discussed  in  the  next  section)  by  consideration  of  an  increased  number  of 
electrodes  and  expanded  frequency  band  for  analysis  (such  as  the  analysis  of 
higher  harmonics  in  relation  to  frequencies  between  harmonics).  It  should 
be recognized that a priori restriction of analysis to only one component of
VER (such as the power of the sinusoidal 4 Hz component of one electrode)
merely  limits  the  potential  to  measure  visual  performance;  some  of  the 
important  facets  relevant  to  good  signal  analysis  were  shown  in  this  section. 
Finally,  along  the  lines  of  information  extraction  we  showed  ultimate 
limitations  for  information  extraction  due  to  the  electrode  and  amplifier 
design.  Interestingly,  it  appears  to  us  these  limitations  have  not  yet  been 
reached (personal communications, F. G. Worden), and little is known about
information  extraction  of  high  frequencies.  As  a  result  of  these  observa¬ 
tions,  we  see  considerable  potential  to  aid  in  improving  upon  current 
results  of  the  U.S.  Air  Force. 


IMPROVED  TECHNIQUES  FOR  VER  ANALYSIS 

The  use  of  classical  spectral  analysis  techniques  for  VER  analysis  has 
been  discussed  in  the  section  on  "Analysis  of  Current  Processing."  These 
methods  are  quite  popular  in  EEG  analysis  and  have  the  advantage  of  being 
almost  universally  understood  in  terms  of  their  basic  properties.  However, 
they  possess  several  drawbacks  which  are  significant  for  VER  analysis: 

1.  Nonstationary  components  are  difficult  to  analyze.  The  VER 
contains  significant  nonstationarities. 

2.  Stochastic  effects  are  not  specifically  accounted  for.  The  VER 
contains  significant  stochastic  effects. 

3.  Relatively  long  data  epochs  are  required. 

4.  They  implicitly  assume  that  activity  is  wideband  when,  in  fact,  a 
more  appropriate  model  may  employ  only  a  relatively  few  generators. 

5. They are data-independent; that is, the decomposition is the same
for  all  signals,  since  the  measured  signal  is  always  represented  as 
a  weighted  sum  of  sinusoids.  This  assumption  is  questionable  for 
signals  of  the  complexity  of  the  VER,  as  indicated  in  1  and  2. 

A  wide  variety  of  techniques  offer  potential  improvements  over  the  use 
of  classical  spectral  analysis.  These  will  be  briefly  discussed  in  this 
section.  More  detailed  discussions  are  presented  in  the  appendixes. 


Philosophy  of  Approach 

The  underlying  philosophy  of  approach  which  is  suggested  for  VER  signal 
processing  is  based  on  the  notion  that  all  available  prior  information 
should  be  brought  to  bear  on  the  problem.  For  example,  if  we  wish  to  model 
the  signal  in  some  fashion,  then  we  should  use  actual  data  to  build  the 
models.  We  should  incorporate  information  relative  to  known  spectral  limits 
(upper  and  lower  bounds),  characteristics  of  disturbances,  structure  of 
underlying generators, etc. As an example, the discussion in the section
"Aspects of EEG/VER Variability" has delineated several research results
describing  the  importance  of  latency  variations.  Models  which  are  utilized 
should  thus  be  consistent  with  the  observed  latency  data,  in  terms  of  its 
relation to amplitude, its possible rates of change, etc.

The approach we propose is model-based and is a combination of
phenomenological  and  physiological  components.  Phenomenological  modeling  is 
a  "black-box"  approach  which  is  often  useful  when  trying  to  emulate  complex 
signal  processes  for  which  no  adequate  models  exist.  Phenomenological 
models  are  developed  without  regard  to  underlying  physiological  structure; 
they  depend  only  on  trying  to  match  observed  phenomena.  The  simplest 
example  of  this  approach  would  be  modeling  the  alpha  wave  as  an  oscillator, 
with  some  additional  nuances  to  allow  for  observed  statistical  variation. 

The  physiological  approach  is  based  on  the  idea  of  using  known  structure  or 
structural  constraints  in  the  model.  Although  we  seek  to  use  physiological 
information  as  appropriate  (e.g.,  dynamics  of  the  eye),  it  is  felt  that  the 
overall  model  will,  of  necessity,  be  more  a  phenomenological  one. 

We  seek  parsimonious  models;  models  which  are  too  complex  almost 
invariably  lead  to  high  noise  sensitivity.  In  the  section  "Analysis  of 
Current  Processing,"  it  was  demonstrated  that  the  variability  observed  in 
the  U.S.  Air  Force  data  was  consistent  with  very  simple  statistical  models. 
Such  simple  models  can  form  an  important  benchmark  in  design  of  models. 

For  example,  one  needs  to  go  to  more  sophisticated  models  only  if  the  simple 
ones  prove  inadequate.  In  addition,  the  modeling  errors  can  often  reveal, 
by  their  nature,  what  the  appropriate  next  modeling  step  is.  By  this 
process,  a  series  of  increasingly  sophisticated  models  can  be  formed,  as 
required,  with  reasonable  assurance  that  the  models  are  not  overly 
sophisticated. 

The  approach  we  propose  is  not  just  based  on  analysis  of  the  VER  but 
utilizes  the  observed  signal  structure  directly.  Function  is  always  linked 
to  structure.  In  the  past  decade,  much  progress  has  been  made  towards 
understanding  the  ways  in  which  information  is  communicated  and  processed  in 
the  brain.  In  addition,  recent  experiments  have  strongly  suggested  that  the 
EEG  itself  is  a  "second  signal  system"  (Adey  (4))  to  which  the  brain  cells 
are  tuned.  That  is,  the  EEG  signal  exerts  an  influence  on  the  way  in  which 
information  is  communicated  and  processed  within  the  brain.  If  this  is 
indeed  true,  then  it  may  be  possible  to  eventually  exert  some  stabilizing 
control  over  the  brain  through  the  use  of  weak  external  electric  fields. 

We  will  not  pursue  the  implications  of  external  control  here.  Rather, 
we  wish  to  point  out  the  importance  of  these  experimental  results  on  the 
philosophy of approach to EEG, especially VER, analysis. Consider the
simple lumped parameter model shown in Figure 12. In Figure 12 (a), the
EEG e(t) is viewed as an output function consisting of noise-corrupted
signal:


e(t)  =  S(t)  +  n(t)  (3) 

In  Figure  12  (b),  the  EEG  is  viewed  as  a  second  signal  system;  thus  e(t) 
contains  information.  The  EEG  depicted  in  Figure  12  (a)  may  contain  almost 
no  information,  since  there  is  no  a  priori  bound  on  the  noise  power;  thus, 
the  EEG  viewed  as  output  could,  in  principle,  be  almost  entirely  buried  in 
noise.  On  the  other  hand,  the  "second  signal  system"  model  puts  practical 
upper  limits  on  the  signal-to-noise  ratio.  Since  the  brain  utilizes  the  EEG 
as a source of information, the signal cannot be almost entirely buried in
noise; extraction of information would be rendered extremely difficult. The
implication  of  this  result  is  that  the  EEG  should,  if  viewed  correctly, 
supply  us  with  information  related  to  on-going  brain  function.  We  are  aware 
of  no  experimental  results  which  would  invalidate  this  conclusion.  We  wish 
to  stress  this  point  here  since  it  has  impact  on  the  philosophy  we  wish  to 
use  in  analyzing  EEG  data. 

Since  function  is  inevitably  related  to  structure,  we  can  conclude  from 
the  preceding  argument  that  the  EEG  may  provide  information  as  to  the  states 
of  groups  of  cells  and,  more  importantly,  to  changes  in  the  states  of  groups 
of  cells.  If  this  can,  in  fact,  be  done,  it  is  strongly  suggested  that  more 
sophisticated  analytical  methodologies  will  be  required  than  have  been  used 
heretofore.  In  his  paper  Adey  (4)  says  that 

lack  of  (new  mathematical  and  statistical  methods)  remains  a 
critical  bottleneck,  in  which  engineering  application  has 
seriously  failed  to  keep  pace  with  new  physiological  knowledge  on 
the  temporal  and  spatial  organization  of  brain  tissue  and  brain 
systems  ....  our  paths  to  an  understanding  of  brain  function  must 
surely  falter  and  fail  unless  and  until  ways  are  found  for 
mathematical  expression  and  analysis  of  the  multidimensional  and 
hierarchical  organization  of  cerebral  information  transaction. 

We  agree  with  this  conclusion  and  would  add  the  following  points: 

1.  To  our  knowledge,  EEG  analysts  have,  as  yet,  not  utilized  several 
powerful  analysis  tools  already  available.  These  include  several 
techniques  of  communication  theory,  adaptive  filtering,  and 
generalized  state  space  modeling  techniques.  These  are  discussed 
in  the  appendixes. 

2.  More  serious  attention  has  to  be  given  to  the  stochastic  aspect  of 
the  problem,  so  that  information-bearing  signals  are  not  treated  as 
noise  and  true  noise  is  filtered  effectively. 

3.  A  methodology  is  required  to  incorporate  the  idea  of  multiple, 
nonlinearly  interacting,  generators. 

4.  Since  the  EEG  is  a  manifestation  of  a  distributed  communication  and 
information  processing  network,  a  distributed  process  model  should 
eventually  be  developed.  This  would  probably  require  a  large 
number  of  electrodes  and  models  based  on  distributed  network,  or 
large-scale systems, theory. These steps are in the future, of
course, but are an indication of the large amount of work that yet
needs to be done in EEG signal-processing development.

These  points  have  been  mentioned  here  to  reinforce  Adey's  very 
important  observation  that  the  present  state-of-the-art  in  EEG  signal 
processing  is  seriously  lacking.  The  methods  which  we  discuss  here  are 
based on the present state-of-the-art in signal analysis. Thus, these
techniques can be tried with a minimum of software development time.


EEG/VER Signal-Processing Objectives


Two  types  of  objectives  may  be  enumerated  in  EEG/VER  signal  processing. 

The  first  is  a  general  set  of  objectives  which  would  be  applicable  to  a 
larger class of problems as well; for example, all biomedical signal-processing
problems. The second type of objective relates to the specific problem
at hand, i.e., EEG/VER signal processing, and includes determination of
descriptive characteristics of the EEG/VER signal; e.g., characteristics not
shared  by  other  types  of  biological  signals. 

General Objectives. The general objectives of EEG/VER signal processing are to:

1.  provide  tools  for  extracting  useful  information 
from  the  EEG  signal; 

2.  perform  information  compression;  and 

3.  remove  noise  from  the  desired  signal  processes. 

Tools  to  be  sought  for  information  extraction  should  be  general  and 
powerful.  They  will  be  determined  on  the  basis  of  extensive  analysis  of  the 
data  to  determine  the  character  of  the  signal  and/or  noise.  Note  that  the 
first  objective  implies  that  a  measure  of  useful  information  be  available. 

This  may  or  may  not  be  a  mathematical  function.  It  could  be  based  on  a 
statement  such  as  "we  wish  to  eliminate  variability  between  these  particular 
measured  signals"  or  "we  wish  to  obtain  a  smooth  response  to  this  flash 
impulse."  In  this  case,  information  may  be  only  vaguely  defined  mathemati¬ 
cally;  however,  an  improvement  in  information  extraction  will  be  readily 
apparent  by  visual  evaluation  of  the  output  of  the  signal  processor.  In  a 
more  mathematical  approach,  measures  such  as  entropy,  rms  fit  error,  or 
others  may  be  used. 

Information  compression  is  a  very  important  step.  Raw  VER  data, 
sampled  at  a  rate  of  250  samples/sec,  for  example,  quickly  fills  up  a 
digital  storage  medium,  especially  when  multiple  electrodes  are  used.  Not 
all  of  the  data  carries  information.  As  a  matter  of  fact,  the  information 
rate  is  probably  very  low.  The  implication  is  that  if  we  can  extract  only 
the  useful  information,  a  tremendous  reduction  in  storage  can  be  realized. 

More important, however, is that this information is what we need in order
to  properly  analyze  the  data.  Information  extraction  is  generally  accom¬ 
plished  by  employing  modeling  techniques,  in  which  the  signal  is  described 
with  a  parsimonious  set  of  parameters.  These  parameters,  then,  are  the 
carriers  of  information. 

Once  the  signal  process  has  been  adequately  described  by,  say,  a 
parametric  model,  it  remains  to  remove  the  undesired  disturbances.  Disturb¬ 
ance  models  may  be  utilized,  which  may  be  generated  based  on  statistical 
analysis of the data. Some disturbances may have specific characteristics
(spike  and  wave,  e.g.)  which  can  be  used  to  advantage  in  detecting  and 
removing  them.  The  VER  has,  characteristically,  a  large  amount  of  noise 
relative  to  signal,  which  is  why  a  single  VER  is  rarely  used  for  analysis. 
Typically  many  VERs  are  averaged.  Although  this  does  tend  to  bring  out  a 
mean VER signal, any variations between responses which carry information
tend to get averaged out. We propose to use techniques which can be used to
process  a  single  VER. 
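
The trade-off described above can be illustrated with synthetic data. The sketch below is a toy simulation, not a model of real VERs: it assumes a Gaussian-shaped response template, additive white Gaussian noise, and optional random latency jitter from trial to trial. All names and parameter values are hypothetical. It shows that ensemble averaging suppresses noise, while latency variation between responses flattens the averaged peak.

```python
import math
import random


def template(n_samples, center, width=5.0):
    """Assumed response shape: a unit-height Gaussian bump at `center`."""
    return [math.exp(-0.5 * ((i - center) / width) ** 2) for i in range(n_samples)]


def simulate_trials(n_trials, n_samples=200, noise_sd=2.0, jitter_sd=0.0, seed=0):
    """Generate trials = latency-shifted template + white Gaussian noise."""
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        center = n_samples / 2 + rng.gauss(0.0, jitter_sd)  # latency jitter
        clean = template(n_samples, center)
        trials.append([v + rng.gauss(0.0, noise_sd) for v in clean])
    return trials


def ensemble_average(trials):
    """Sample-by-sample mean across trials."""
    n = len(trials)
    return [sum(column) / n for column in zip(*trials)]


def rms_error(x, ref):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, ref)) / len(ref))
```

With no jitter, the averaged trace approaches the template as the number of trials grows. With jitter that is large relative to the width of the response, the averaged peak is noticeably lower than the peak of any single response, which is the information loss that single-response processing aims to avoid.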

A  general  model  of  the  process  by  which  the  measured  EEG  is  generated 
is  given  below. 


Information  is  thought  of  as  being  modulated  by  one  or  a  set  of  modulating 
functions of unknown character. For example, if the EEG contains phase-
modulated signals, then the information we seek would be carried by a phase
process φ(t). Assuming simple sinusoidal carrier modulation at frequency ω,
the EEG in the absence of disturbances would be sin(ωt + φ(t)), and φ(t) could
be recovered using phase-lock loop techniques described in Appendix D. Note
that if φ(t) were a constant, the modulation transforms the constant φ into
a continuous time process with infinite duration. The disturbances d₁ and
d₂ may be correlated in practice. Using this general model, our objective
is  to: 

1. remove disturbance d₂;

2.  recover  the  undisturbed  modulating  function;  and 

3.  demodulate  to  recover  the  desired  information. 
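
For the simplest case mentioned above, a constant phase carried by a sinusoid of known frequency, the three steps can be sketched with quadrature (in-phase/quadrature) correlation rather than a phase-lock loop. This is a minimal illustration under stated assumptions: the carrier frequency is known exactly, the record spans a whole number of carrier cycles, and a single additive white disturbance stands in for the disturbances of the general model. Function names and values are hypothetical.

```python
import math
import random


def modulate(phi, f_hz, fs_hz, n_samples):
    """Undisturbed model signal sin(w*n + phi) for a constant phase phi."""
    w = 2.0 * math.pi * f_hz / fs_hz  # carrier frequency in rad/sample
    return [math.sin(w * n + phi) for n in range(n_samples)]


def add_disturbance(x, sd, seed=0):
    """Add white Gaussian noise as a stand-in for the disturbances."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sd) for v in x]


def demodulate_phase(x, f_hz, fs_hz):
    """Estimate the constant phase by correlating with quadrature carriers.

    Over whole carrier cycles, sum(x*sin) is proportional to cos(phi) and
    sum(x*cos) to sin(phi); correlation also averages out the disturbance.
    """
    w = 2.0 * math.pi * f_hz / fs_hz
    in_phase = sum(v * math.sin(w * n) for n, v in enumerate(x))
    quadrature = sum(v * math.cos(w * n) for n, v in enumerate(x))
    return math.atan2(quadrature, in_phase)
```

A usage example: `demodulate_phase(add_disturbance(modulate(0.7, 10.0, 250.0, 250), 0.5), 10.0, 250.0)` returns an estimate close to 0.7 rad. A time-varying phase would call for a tracking method instead of a single block correlation.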

Specific  Objectives.  The  specific  objectives  are  predicated  upon 
satisfaction  of  the  general  objectives  just  discussed.  The  objectives, 
along  with  analysis  tools  and  subtasks,  are  listed  below  in  approximate 
chronological  order. 

EEG/VER Data Analysis

    compressed spectral arrays
    correlation analysis
    transient response
    effect of VER averaging
    cross-correlation analysis
        (a) between patients
        (b) same patient/different times

Model Development

    overall signal processor structure
    physiological considerations
    signal models
    noise/disturbance models

Performance Evaluation

    parameter identification
    model sensitivity determination
    model validation
    information extraction
    pattern recognition

Redefinition Phase

    project redefinition
    experiment design
        (a) enhance model validation
        (b) disturbance reduction
        (c) enhance performance
    model redefinition
    performance evaluation criteria

This  list  is  meant  to  represent  an  outline  of  objectives  and  there  will 
be  a  synergism  between  the  specific  topics  listed.  For  example,  some 
modeling  may  be  performed  prior  to  data  analysis  to  better  define  the  data 
analysis  objectives.  Prefiltering  may  be  necessary  to  reduce  disturbances 
in  regions  outside  the  spectral  bands  of  interest. 

Data  analysis  is  the  initial  information-gathering  phase  in  which  we 
seek  to  learn  as  much  as  possible  about  the  measured  data.  An  important 
aspect  here  is  the  sensitivity  of  the  analysis  to  uncontrolled  or  unmeasured 
changes.  Model  development  will  proceed  in  earnest  in  the  next  phase.  It 
is  expected  that  both  simple  and  sophisticated  models  will  be  developed. 
However,  the  preferred  approach  will  be  to  utilize  simple  models  initially 
and  investigate  the  conditions  under  which  these  are  inadequate.  This 
approach will provide insight into development of more appropriate and
complex  models  at  a  later  stage. 

Performance will be assessed using a variety of techniques.
Robustness and sensitivity will be evaluated. Artificial data will
be  generated,  as  required,  to  provide  controlled  inputs  and  disturbances. 
Pattern  recognition  techniques  will  be  used,  as  necessary,  to  assist  in 
evaluation  of  the  signal  processing  results  in  case  there  are  more  than  two 
or  three  parameters  in  the  models  (which  we  expect). 

Redefinition  of  the  problem  and  proposed  solutions  are  expected  to  take 
place  during  all  of  the  above  phases,  as  new  information,  other  research 
findings, test data, etc., become available.


Discussion of Techniques


We  now  present  a  brief  discussion  of  several  techniques  which  may  be 
useful  for  VER  interpretation  and  analysis.  The  reader  is  referred  to  the 
appropriate  appendix  for  a  more  detailed  description  of  the  techniques. 

Frequency-Domain  Signal-Processing  Techniques.  There  are  several 
different  types  of  frequency-domain  signal-processing  techniques,  all  of 
which  are  based  on  a  decomposition  of  the  data  into  a  set  of  spectral 
components  (sinusoids).  Care  must  be  taken,  when  using  digital  processing, 
that  the  sampling  rates  are  high  enough  to  capture  the  frequencies  of 
interest  without  aliasing.  In  addition,  sampling  windows  must  be  wide 
enough  to  capture  the  lowest  frequencies  of  interest. 
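
The two sampling requirements above reduce to simple arithmetic, sketched below. The helper names are hypothetical. The sampling rate must exceed twice the highest frequency of interest, a component above half the sampling rate folds back to a lower apparent frequency, and the record length sets both the lowest observable frequency and the spacing of the discrete Fourier frequencies.

```python
def nyquist_rate(f_max_hz):
    """Sampling rate that must be exceeded to avoid aliasing up to f_max_hz."""
    return 2.0 * f_max_hz


def alias_frequency(f_hz, fs_hz):
    """Apparent frequency (0..fs/2) of a sinusoid at f_hz sampled at fs_hz."""
    folded = f_hz % fs_hz
    return min(folded, fs_hz - folded)


def min_window_seconds(f_min_hz, cycles=1.0):
    """Record length needed to contain `cycles` periods of f_min_hz."""
    return cycles / f_min_hz


def dft_resolution_hz(n_samples, fs_hz):
    """Spacing between discrete Fourier frequencies for an n_samples record."""
    return fs_hz / n_samples
```

For example, at 250 samples/sec a 150 Hz component that is not removed before sampling appears at 100 Hz, and a 2-second record (500 samples) gives a frequency spacing of 0.5 Hz.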

The  Fast  Fourier  Transform  (FFT)  is  perhaps  the  most  popular  spectral 
decomposition  technique  for  EEG/VER  analysis.  A  detailed  discussion  of  the 
properties  of  the  FFT  when  used  for  analysis  of  the  VER  under  a  periodic 
visual  stimulation  is  given  in  the  section  "Analysis  of  Current  Processing." 

Recently,  several  alternate  techniques  have  been  developed  for  spectral 
estimation  which  offer  advantages  over  the  FFT  in  noisy  environments  and 
when  the  data  epochs  are  rather  short,  perhaps  only  a  partial  cycle  at  the 
lowest  frequencies  of  interest.  These  are  the  Maximum  Entropy  Method  (MEM) 
and  the  Maximum  Likelihood  Method  (MLM). 
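
As an illustration of the maximum-entropy idea, the sketch below uses Burg's recursion, which is one common way of computing a maximum-entropy (autoregressive) spectral estimate directly from a short record. It is offered as an assumption about how MEM might be realized, not as the formulation given in Appendix C; the function names are hypothetical. The sign convention is x[n] + a[1]x[n-1] + ... + a[p]x[n-p] = e[n].

```python
import math


def burg(x, order):
    """Fit AR coefficients by Burg's method.

    Returns (a, err) with a[0] == 1.0 and err the estimated
    prediction-error (driving-noise) variance.
    """
    n = len(x)
    f = list(x)  # forward prediction errors
    b = list(x)  # backward prediction errors
    a = [1.0]
    err = sum(v * v for v in x) / n
    for m in range(order):
        num = 0.0
        den = 0.0
        for i in range(m + 1, n):
            num += f[i] * b[i - 1]
            den += f[i] ** 2 + b[i - 1] ** 2
        k = -2.0 * num / den  # reflection coefficient, |k| <= 1
        padded = a + [0.0]
        a = [padded[i] + k * padded[m + 1 - i] for i in range(m + 2)]
        # Update errors in place, highest index first so b[i-1] is still old.
        for i in range(n - 1, m, -1):
            f_old = f[i]
            f[i] = f_old + k * b[i - 1]
            b[i] = b[i - 1] + k * f_old
        err *= 1.0 - k * k
    return a, err


def ar_psd(a, err, f_hz, fs_hz):
    """AR (maximum-entropy) power spectral density at frequency f_hz."""
    w = 2.0 * math.pi * f_hz / fs_hz
    re = sum(ak * math.cos(w * k) for k, ak in enumerate(a))
    im = -sum(ak * math.sin(w * k) for k, ak in enumerate(a))
    return err / (fs_hz * (re * re + im * im))
```

Because the spectrum is evaluated from a fitted model rather than from a windowed transform, `ar_psd` can be computed on an arbitrarily fine frequency grid; the choice of model order remains a judgment that this sketch does not address.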

A  detailed  discussion  of  frequency-domain  signal-processing  techniques 
is  given  in  Appendix  C. 

Communication-Theoretic Methods. These methods are model-based
techniques  of  signal  tracking  and  may  prove  useful  in  VER  signal  analysis. 
The  models  are  based  on  the  assumption  that  the  VER  or  EEG  is  a  process 
composed  of  signal  and  noise  components.  The  signal  components  we  wish  to 
track  are  further  assumed  to  be  composed  of  one  or  more  periodic  processes 
which  are  modulated.  The  modulation  which  we  wish  to  recover  is  the 
information  carrier.  As  an  example,  latency  variations  in  the  VER  could  be 
tracked  using  this  approach.  It  is  well  known  that  the  phase  coherence  in 
the  spontaneous  EEG  is  affected  by  attention.  This  may  also  be  true  in  the 
VER,  although  to  a  lesser  extent. 

It  appears  that  the  use  of  phase-lock  loops  is  an  attractive  approach 
to  the  problem  of  signal  tracking  and  recovery  of  modulation  information. 
These  can  be  designed  to  recover  amplitude  modulation  (AM),  phase  modulation 
(PM),  frequency  modulation  (FM),  or  a  combination  of  these.  In  addition, 
other  modulation  models,  such  as  phase  or  frequency  shift  keying,  can  be 
tried. 
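
A phase-lock loop for phase tracking can be sketched in a few lines. The example below is a first-order digital loop under simplifying assumptions: a unit-amplitude sinusoidal input, a carrier frequency known in advance, and a fixed loop gain chosen by hand. It is not the design discussed in Appendix D, and the function name and gain value are hypothetical.

```python
import math


def pll_track_phase(x, f0_hz, fs_hz, gain=0.05):
    """Track the phase offset of x[n] ~ sin(w0*n + phi) with a first-order loop.

    Returns the running estimate of phi, one value per input sample.
    """
    w0 = 2.0 * math.pi * f0_hz / fs_hz  # nominal carrier, rad/sample
    theta = 0.0                         # oscillator phase
    phase_estimate = []
    for n, sample in enumerate(x):
        phase_estimate.append(theta - w0 * n)
        # Multiplier phase detector: its average is 0.5*sin(phase error);
        # the double-frequency product is smoothed by the loop itself.
        error = sample * math.cos(theta)
        theta += w0 + gain * error
    return phase_estimate
```

The estimate carries a small ripple at twice the carrier frequency, so in practice it would be smoothed or averaged over whole carrier cycles. A smaller gain reduces the ripple and the noise sensitivity at the price of slower tracking of changes in the phase.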

A  detailed  discussion  of  communication-theoretic  methods  and  phase-lock 
loops  is  given  in  Appendix  D. 

Nonadaptive  Time-Domain  Analysis.  Time-domain  signal-processing 
techniques  have  the  advantage  of  working  directly  with  data  as  it  evolves  in 
time.  Typically,  time-domain  techniques  are  recursive  in  nature;  that  is, 
the signal-parameter estimation process evolves in time along with the
actual data. The most commonly used approaches utilize stochastic Markov
process models for the observed data; these models are consistent with the
need to build recursive signal processors.

Perhaps  the  most  important  class  of  time-domain  models  are  auto¬ 
regressive  moving-average  (ARMA)  processes.  These  have  been  used  to  model 
the  spontaneous  EEG  (e.g.,  Bohlin  (12),  Zetterburg  and  Kjell  (121),  Segen 
and Sanderson (100)) and should have application to modeling of the VER as
well.
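
To make the ARMA structure concrete, the sketch below generates an ARMA process from a driving-noise sequence using the defining difference equation. The coefficient convention (autoregressive terms added on the right-hand side) and the helper names are assumptions made for illustration; `resonator_ar` shows how a second-order autoregressive part can be chosen to resonate near a given frequency, in the spirit of modeling a rhythmic component.

```python
import math


def simulate_arma(ar, ma, noise):
    """Generate x[n] = sum_k ar[k]*x[n-1-k] + e[n] + sum_k ma[k]*e[n-1-k].

    `noise` is the driving sequence e[n]; values before n = 0 are taken as 0.
    """
    x = []
    for n, e in enumerate(noise):
        value = e
        for k, coeff in enumerate(ar):
            if n - 1 - k >= 0:
                value += coeff * x[n - 1 - k]
        for k, coeff in enumerate(ma):
            if n - 1 - k >= 0:
                value += coeff * noise[n - 1 - k]
        x.append(value)
    return x


def resonator_ar(f0_hz, fs_hz, radius):
    """AR(2) coefficients with poles at radius*exp(+/- j*2*pi*f0/fs)."""
    angle = 2.0 * math.pi * f0_hz / fs_hz
    return [2.0 * radius * math.cos(angle), -radius * radius]
```

Driving `simulate_arma(resonator_ar(10.0, 250.0, 0.95), [], noise)` with white noise gives a narrow-band process centered near 10 Hz; a pole radius closer to 1 gives a sharper resonance.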


A  more  general  approach  to  time-domain  tracking  of  signals  is  the 
Kalman  Filter.  This  filter  utilizes  a  linear  Markov  process  model  and  a 
linear  measurement  model.  The  underlying  dynamical  process  is  driven  by 
a  white  noise  process  and  the  measurements  are  assumed  to  be  noisy.  There 
are  several  advantages  of  Kalman  Filters  over  the  ARMA  modeling  approach. 

The  structure  of  the  filter  is  a  more  intuitive  one,  allowing  the  designer 
to  better  use  his  judgment  in  constructing  the  filter.  An  even  more 
significant  advantage  is  the  fact  that  model  identification  is  much  simpler, 
and  specialized  software  exists  for  determining  the  structure  and  estimating 
the  parameters  of  the  Kalman  Filter  directly  from  time-series  data. 

Finally,  a  third  advantage  lies  in  the  structure  of  the  filter  itself.  It 
is  relatively  easy  to  model  nonlinear  effects  and  account  for  known  time 
variation  of  model  parameters.  The  Kalman  Filter  is  backed  by  almost  20 
years  of  theoretical  study  and  application  to  a  diverse  set  of  problems  in 
many  fields  including  seismology,  geology,  biological  signal  processing, 
space  navigation,  economic  and  financial  forecasting,  meteorology,  hydrology, 
image  analysis,  radar,  and  sonar  tracking.  In  short,  the  Kalman  Filter  has 
proved  to  be  beneficial  in  estimation  and  tracking  problems  where  there  are 
many  variables  to  be  simultaneously  estimated  and  the  signal  process  to  be 
estimated  evolves  essentially  as  a  stochastic  Markov  process.  The  EEG/VER 
signal-analysis  problem  should  be  amenable  to  this  approach  since  it  appears 
to  satisfy  these  requirements. 
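
The predict/update cycle of the Kalman Filter is easiest to see in the scalar case. The sketch below assumes a first-order linear Markov state x[k+1] = a*x[k] + w[k] observed as z[k] = x[k] + v[k], with white noises of variance q and r. It is a textbook special case given for orientation only; the parameter names and default values are assumptions, and a practical EEG/VER model would have a vector state.

```python
def kalman_1d(measurements, a=1.0, q=1e-3, r=1.0, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the filtered state estimates.

    a  -- state transition coefficient
    q  -- process (driving) noise variance
    r  -- measurement noise variance
    x0 -- initial state estimate, p0 -- its error variance
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: propagate estimate and error variance through the model.
        x = a * x
        p = a * p * a + q
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + r)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

The ratio q/r acts as the tuning knob: a larger value lets the estimate follow rapid changes in the signal, while a smaller value gives heavier smoothing of the measurement noise.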

A  further  discussion  of  nonadaptive  time-domain  approaches  is  given  in 
Appendix  E. 

Nonlinear  Systems  Analysis.  The  techniques  discussed  to  this  point 
have  been  based  on  linear  systems  analysis  in  which  the  superposition 
principle  holds;  that  is,  it  has  been  implicitly  assumed  that  the  evoked 
response  is  a  superposition  of  several  responses  and  that  the  total  response 
is  a  linear  combination  of  these  responses.  Furthermore,  linearity  implies 
the  absence  of  saturation  or  hysteresis  phenomena.  It  is  well  known  that 
there  are  many  nonlinear  phenomena  underlying  evoked  responses,  most 
fundamentally  in  the  generation  of  electrical  potentials  via  the  synapse. 

At  a  higher  level,  these  nonlinear  effects  may  not  be  apparent  directly. 
However,  they  may  manifest  themselves  in  the  evoked  response  via  entrainment 
or  saturation  phenomena,  which  have  been  observed  often  in  EEG  analysis. 

Nonlinear  analysis  is  much  more  difficult  than  linear  analysis  and,  for 
this  reason,  no  general  tools  exist  which  are  appropriate  for  all  problems. 
However,  several  tools  have  been  developed  which  may  prove  useful  in 
analysis  of  the  evoked  potentials.  Two  of  these  are:  (1)  describing 
function  analysis,  (2)  Volterra  series  analysis.  These  are  described  in 
some  detail  in  Appendix  F. 
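
As a small worked example of describing function analysis, consider an ideal saturation with unit slope, a nonlinearity chosen here only for illustration. Its describing function is the ratio of the fundamental component of the output to the amplitude of a sinusoidal input. The sketch below gives the standard closed-form expression and a numerical cross-check obtained by integrating over one cycle; the function names are hypothetical.

```python
import math


def saturation(x, limit):
    """Unit-slope saturation: output is clipped to [-limit, +limit]."""
    return max(-limit, min(limit, x))


def describing_function_saturation(amplitude, limit):
    """Closed-form describing function N(A) of the unit-slope saturation."""
    if amplitude <= limit:
        return 1.0  # the input never reaches the limits: purely linear
    ratio = limit / amplitude
    return (2.0 / math.pi) * (math.asin(ratio) + ratio * math.sqrt(1.0 - ratio * ratio))


def describing_function_numeric(nonlinearity, amplitude, n_points=10000):
    """Fundamental-harmonic gain of an odd, memoryless nonlinearity.

    Computes b1 = (1/pi) * integral of y(theta)*sin(theta) over one cycle
    by the rectangle rule and returns b1 / amplitude.
    """
    total = 0.0
    for i in range(n_points):
        theta = 2.0 * math.pi * i / n_points
        total += nonlinearity(amplitude * math.sin(theta)) * math.sin(theta)
    b1 = 2.0 * total / n_points
    return b1 / amplitude
```

The equivalent gain falls as the input amplitude grows beyond the saturation limit, which is the amplitude-dependent behavior that a purely linear model cannot represent.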

Adaptive and Robust Schemes. There is strong evidence that the nature
of  the  evoked  response  may  change  significantly  in  time  over  both  long  and 
short  epochs.  The  nature  of  these  changes  may  be  evidenced  in  several  ways, 
including  spectral  content,  amplitude,  rms  variation,  etc.  There  may  be 
short  transient-like  phenomena  which  may  be  part  of  the  signal  process  or 
may  be  due  to  extraneous  effects  (artifacts)  we  wish  to  remove.  All  of 
these  effects  represent  relatively  unpredictable  phenomena  and  the  attendant 
evoked  responses  are  then  nonstationary  processes.  We  need  approaches  which 
can  adapt  to  information-bearing  changes  in  the  observed  data  and  which  are, 
at  the  same  time,  robust  (i.e.,  insensitive  to  noise,  artifact,  and  other 
extraneous  clutter). 

In  order  to  track  nonstationary  processes,  more  sophistication  is 
required  than  for  stationary  processes,  since  the  observed  data  are 
qualitatively  more  complex.  Several  methods  have  been  developed,  however, 
which  are  felt  to  be  particularly  attractive  for  analysis  of  VER  data. 

These  methods  can  be  utilized  to  analyze,  simultaneously,  data  from  multiple 
leads,  and  can  do  this  without  the  requirement  of  growing  memory  for  longer 
data  epochs.  In  addition,  real-time  analysis  may  be  possible  for  few  leads 
and  for  simple  models.  Such  (close  to)  real-time  tracking  and  parameter 
estimation  can  be  of  great  help  in  assessing  the  results  and  validity  of  a 
particular  test  shortly  after  or  even  during  the  test  itself. 

The  methods  which  appear  to  have  merit  for  nonstationary  VER  analysis 
are:  (1)  adaptive  noise  cancelling,  (2)  adaptive  ARMA  modeling  (analysis  of 
changing spectra), (3) adaptive Kalman filtering, and (4) piecewise-stationary
modeling.

Adaptive  noise  cancelling  is  a  heuristic  technique  based  on  the 
assumption  that  the  observed  data  in  a  particular  lead  consists  of  signal 
plus  noise,  with  the  signal  and  noise  components  correlated  in  a  known 
qualitative way with signal and noise components in adjacent leads.
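
One widely used realization of adaptive noise cancelling is the least-mean-squares (LMS) canceller, sketched below. The sketch makes a stronger assumption than the text: a reference input (for example, from another lead) is taken to contain the noise but none of the desired signal, and the noise reaches the primary lead through a short linear filter. The function name, step size, and number of taps are hypothetical.

```python
def lms_noise_canceller(primary, reference, n_taps=2, mu=0.005):
    """LMS adaptive noise canceller.

    primary   -- signal plus filtered noise
    reference -- noise-only input, correlated with the noise in `primary`
    Returns (cleaned_output, final_weights).
    """
    weights = [0.0] * n_taps
    output = []
    for n in range(len(primary)):
        # Most recent reference samples; samples before n = 0 are taken as 0.
        taps = [reference[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        noise_estimate = sum(w * u for w, u in zip(weights, taps))
        error = primary[n] - noise_estimate  # error doubles as the cleaned output
        weights = [w + mu * error * u for w, u in zip(weights, taps)]
        output.append(error)
    return output, weights
```

If the reference also contains part of the desired signal, as adjacent scalp leads generally would, the canceller removes some signal along with the noise, so the correlation assumptions need to be checked against the data.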

Adaptive  estimation  using  ARMA  modeling  or  Kalman  filtering  techniques 
employs  more  structure  for  the  signal  process;  the  underlying  signal  is 
modeled  as  a  stochastic  Markov  process  of  known  order  and  form.  The 
parameters  of  the  model  are  then  estimated  recursively  and  used  to  infer  and 
track  changes  in  the  characteristics  of  the  evoked  response.  Bohlin  (12) 
has  successfully  applied  adaptive  AR  modeling  to  tracking  the  spontaneous 
EEG, and it appears that this technique should also be applicable to the VER.
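
A minimal sketch of recursive parameter tracking is given below. It assumes a
first-order AR model and an exponential forgetting factor, both chosen here
for brevity; it is not Bohlin's algorithm, and the function names and
simulated data are hypothetical. The point is that the estimate is updated
sample by sample with fixed memory.

```python
import random


def track_ar1(samples, forgetting=0.995):
    """Exponentially weighted least-squares estimate of a in
    x[n] = a * x[n-1] + e[n], updated at every sample."""
    num = 0.0    # weighted sum of x[n] * x[n-1]
    den = 1e-9   # weighted sum of x[n-1] ** 2 (small start avoids division by zero)
    estimates = []
    for n in range(1, len(samples)):
        num = forgetting * num + samples[n] * samples[n - 1]
        den = forgetting * den + samples[n - 1] ** 2
        estimates.append(num / den)
    return estimates


def simulate_switching_ar1(n_samples=4000, a_first=0.9, a_second=0.2, seed=1):
    """Synthetic record whose AR coefficient changes halfway through."""
    rng = random.Random(seed)
    x = [0.0]
    for n in range(1, n_samples):
        a = a_first if n < n_samples // 2 else a_second
        x.append(a * x[-1] + rng.gauss(0.0, 1.0))
    return x
```

A forgetting factor closer to 1 gives smoother estimates but slower response
to changes; the value used here is only a starting point.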

Piecewise  stationary  modeling  of  the  VER  is  based  on  the  idea  that  the 
VER  can  be  adequately  modeled  as  a  stationary  process  over  sufficiently 
short data epochs. These epochs are separated by points of transition at
which the signal process is assumed to jump from one type to another.
Thus,  changes  in  the  behavior  of  the  VER  are  assumed  to  occur  in  discrete 
steps  rather  than  continuously  over  time.  Several  methods  are  available  to 
handle this type of process. The most appropriate method depends
on the nature of the jumps. If the time between jumps is relatively long,
and the signal types are relatively distinct and few in number,
multiple hypothesis testing methods involving a bank of Kalman filters might
be appropriate (Lainiotis and Park (59)). These have been successfully
applied by Scientific Systems to ECG rhythm analysis (Gustafson et al. (45)).
If there are a large number of different signal types, multiple AR models
can be derived using cluster analysis. This method has been successfully
applied to analysis of the spontaneous EEG by Segen and Sanderson (100). If
the time between jumps is relatively short or if there are transient
artifacts, a generalized likelihood ratio approach (Willsky and Jones (118))
may be more appropriate. This technique allows one both to eliminate
artifacts and to identify particular types of transients, as desired. This
approach has been successfully applied to the detection and identification
of cardiac rhythms using the ECG by Scientific Systems (Gustafson et al. (46)).
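
The flavor of the generalized likelihood ratio approach can be conveyed by a
much simplified sketch. It assumes the model residuals are white and Gaussian
with known standard deviation and that a jump appears as a step in their mean;
the full method of Willsky and Jones works on Kalman filter innovations and is
not reproduced here. All names, the window length, and the threshold are
hypothetical.

```python
import random


def glr_step_detector(residuals, sigma, threshold=25.0, window=50):
    """Scan residuals for a sustained jump in mean.

    Returns (alarm_index, onset_estimate, jump_estimate) at the first alarm,
    or None if the statistic never exceeds the threshold.
    """
    for n in range(len(residuals)):
        running_sum = 0.0
        best = (0.0, n, 0.0)  # (statistic, onset, jump)
        # Try every candidate onset k within the last `window` samples.
        for k in range(n, max(n - window, -1), -1):
            running_sum += residuals[k]
            count = n - k + 1
            # Twice the log-likelihood ratio, maximized over the jump size.
            statistic = running_sum ** 2 / (sigma ** 2 * count)
            if statistic > best[0]:
                best = (statistic, k, running_sum / count)
        if best[0] > threshold:
            return n, best[1], best[2]
    return None


def synthetic_residuals(n_samples=300, onset=200, jump=3.0, sigma=0.5, seed=2):
    """White Gaussian residuals with a step added from `onset` onward."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) + (jump if n >= onset else 0.0)
            for n in range(n_samples)]
```

The detector returns both the time of the alarm and the most likely onset and
size of the jump, which is what permits either rejection of an artifact or
identification of a transient.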

All  of  the  techniques  mentioned  above  have  robustness  properties,  since 
they  are  designed  to  be  insensitive  to  noise  and  other  artifacts.  The 
choice  of  the  most  appropriate  technique  must  await  detailed  analysis  of  raw 
VER  data  on  a  variety  of  subjects  and  under  a  wide  variety  of  conditions.  A 
further  discussion  of  adaptive  and  robust  techniques  is  given  in  Appendix  G. 

Feature  Extraction  and  Pattern  Recognition  Techniques.  The  previous 
subsections  have  been  based  on  the  notion  that  it  is  possible  to  generate 
structural  models  of  the  VER  and  then  infer  the  VER  characteristics  from  the 
parameters  of  the  model.  It  may,  in  fact,  turn  out  that  it  is  not  possible 
to  generate  adequate  structural  models  of  the  VER.  For  example,  the 
required  number  of  parameters  may  be  too  large. 

If  this  turns  out  to  be  the  case,  it  might  be  more  appropriate  to 
utilize  pattern  recognition  techniques.  Pattern  recognition  may  also  be 
useful,  if  parametrized  models  are  employed,  to  analyze  the  relationships 
between  the  parameters. 

Pattern  recognition  is  an  approach  which  is  essentially  model-free;  it 
depends,  however,  upon  having  a  sufficiently  large  data  base  on  hand.  This 
approach  has  been  found  to  be  particularly  useful  in  many  biomedical  signal 
processing  problems,  simply  because  of  the  inability  to  develop  meaningful 
models of biomedical signals. The ECG is a good example of this. Although
the ECG is apparently simpler in nature than the EEG, no parametric model
presently exists which can adequately capture its observed variations.
Recourse  has  inevitably  been  made  to  the  tools  of  pattern  recognition. 

Pattern recognition generally takes place in two steps: (1) feature
extraction, wherein a parsimonious representation of the raw data is sought;
and (2) classification, wherein the features are utilized in a decision rule to
identify the pattern of the original data.

The  feature  extraction  step  is  extremely  important,  since  we  wish  to 
obtain  an  accurate  representation  of  the  data  using  as  few  parameters  as 
possible.  A  particularly  powerful  technique  which  appears  appropriate  for 
VER  analysis  is  the  Karhunen-Loeve  expansion.  By  this  technique  it  is 
possible  to  represent  the  time-synchronized  VER  (cf.  Figure  2)  as  a  weighted 
combination  of  predetermined  basis  functions.  This  technique  has  been 
successfully  applied  to  representation  of  the  ECG,  and  further  discussion  of 
this  approach  is  given  in  Appendix  M. 
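
A minimal sketch of the idea follows. It extracts only the leading basis
function from a set of time-synchronized sweeps, using power iteration on the
sample covariance; a practical analysis would retain several basis functions
and would normally use a library eigenvalue routine. The function names are
hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dominant_basis_function(sweeps, n_iter=100):
    """Return (mean_sweep, basis), where basis is the unit-norm leading
    eigenvector of the sample covariance of the time-locked sweeps."""
    n_sweeps = len(sweeps)
    mean = [sum(column) / n_sweeps for column in zip(*sweeps)]
    centered = [[x - m for x, m in zip(sweep, mean)] for sweep in sweeps]
    basis = [1.0 / math.sqrt(len(mean))] * len(mean)
    for _ in range(n_iter):
        # Covariance times vector without forming the covariance matrix:
        # C v = (1/N) * sum_i (c_i . v) c_i
        projections = [dot(c, basis) for c in centered]
        product = [sum(p * c[t] for p, c in zip(projections, centered)) / n_sweeps
                   for t in range(len(mean))]
        norm = math.sqrt(dot(product, product))
        if norm == 0.0:
            break  # no variance along the current direction
        basis = [x / norm for x in product]
    return mean, basis


def kl_coefficient(sweep, mean, basis):
    """Feature value: projection of the mean-removed sweep onto the basis."""
    return dot([x - m for x, m in zip(sweep, mean)], basis)
```

Each sweep is then summarized by its coefficient, and the sweep can be
approximated as the mean plus the coefficient times the basis function.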

Once feature extraction has been performed it remains to extract the
desired information from the numerical values of the features. Assuming
there are more than two or three features, this information extraction can
best be performed using pattern classification techniques. This involves
the use of some type of decision logic to discriminate between patterns.
This decision logic can be formed using two different types of approaches:
(1) supervised learning, (2) unsupervised learning, or cluster analysis. In
supervised learning, the pattern of each VER is known; that is, each VER can
be labeled according to some typification. Decision rules are then generated
to correctly classify each of the known cases. In unsupervised learning,
such labeling is not used (generally it would not be available) and, in
addition, the number of distinct classes of responses is not known. A wide
variety  of  techniques  are  available  for  both  supervised  and  unsupervised 
pattern  recognition,  and  the  choice  of  technique  depends  upon  the  nature  of  the 
problem  at  hand.  It  is  generally  true,  however,  that  supervised  learning  is 
preferred  assuming  that  labeling  of  the  responses  can  be  carried  out.  Further 
discussion  of  pattern  recognition  techniques  is  given  in  Appendix  M. 
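
The distinction can be illustrated with two deliberately simple rules: a
nearest-class-mean classifier for the supervised case and a k-means grouping
for the unsupervised case. These are stand-ins chosen for brevity, not the
techniques of Appendix M. Note that k-means requires the number of clusters
to be supplied, which is an additional assumption. All names are hypothetical.

```python
def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def class_means(features, labels):
    """Mean feature vector for every label that occurs."""
    sums, counts = {}, {}
    for x, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_label(x, centers):
    """Decision rule: pick the label whose center is closest to x."""
    return min(centers, key=lambda label: squared_distance(x, centers[label]))


def k_means(features, k, n_iter=20):
    """Unsupervised grouping into k clusters (k must be chosen by the user)."""
    centers = {j: list(features[j]) for j in range(k)}  # naive start: first k points
    assignment = [0] * len(features)
    for _ in range(n_iter):
        assignment = [nearest_label(x, centers) for x in features]
        # Clusters that received no points keep their previous center.
        centers.update(class_means(features, assignment))
    return centers, assignment
```

In the supervised case, class_means is applied to labeled feature vectors and
nearest_label then classifies a new response; in the unsupervised case,
k_means is applied to the unlabeled feature vectors directly.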


CONCLUSIONS  AND  RECOMMENDATIONS 

Based  on  our  analysis  of  data  supplied  by  the  U.S.  Air  Force  and  a  review 
of the relevant literature, we conclude that the observed variability in the
processed  data  is  most  probably  due  to  the  small  averaged  evoked  response 
amplitude  relative  to  the  background  EEG.  This  results  in  a  low 
signal  (response)-to-noise  (EEG)  ratio,  which  is  a  severe  handicap  to  the 
current  U.S.  Air  Force  processing  technique,  as  discussed  in  the  section 
"Analysis  of  Current  Processing."  The  basic  problem  of  VER  variability  has 
also been noted by many other researchers, although under different
experimental  conditions,  as  indicated  in  the  section  "Aspects  of  EEG/VER 
Variability."  Nonetheless,  we  believe  the  variability  may  be  reduced  by 
alternate  processing  techniques  and  possible  experiment  modification. 

To  reduce  the  variability  by  modifying  the  signal  processing,  an  analysis 
of  the  raw  data  (measured  EEG  and  VER  signals  as  currently  recorded)  is 
necessary.  Some  specific  tests  that  should  be  performed  are  outlined  in  the 
subsection  "Remarks"  on  page  21.  The  complete  analysis  procedure  will  depend  on 
the  results  from  these  early  tests,  of  course,  and  is  hard  to  specify  at  this 
time.  The  analysis  of  the  raw  data  is  the  most  important  step  towards  reducing 
variability.  Using  this  data,  a  signal  model  can  be  developed  and  a  processing 
technique  selected  from  those  described  in  the  section  "Improved  Techniques  for 
VER  Analysis."  If  the  performance  of  the  improved  techniques  is  not  sufficient 
to  meet  the  experiment  objectives,  modification  of  the  experiment  and  redesign 
of  the  processing  (to  fit  the  new  experiment)  will  be  necessary. 

A  list  of  recommendations  for  reducing  variability  through  experiment 
modification  is  given  in  Appendix  N.  These  techniques  have  been  extracted 
from  the  VER  literature,  and  may  be  reviewed  to  determine  whether  any  proven 
techniques  are  not  now  being  used  but  may  be  incorporated  without  violating  any 
experiment  ground  rules  or  constraints.  If  the  use  of  these  techniques  does 
not  sufficiently  reduce  variability,  then  a  modification  of  the  imposed 
constraints may be necessary to achieve higher signal-to-noise ratios.

Finally, if major modification of the experiment is needed, modern
signal-processing techniques may be used to help design new experiments and
their associated processing, as discussed in Appendix H.


APPENDIX A


BACKGROUND  OF  VER/EEG  EXPERIMENTS  AND  ANALYSIS 


The  analysis  of  electrical  potentials  from  the  surface  of  the  human 
head  is  assuming  increasing  importance  in  the  evaluation  of  brain  functions, 
since  these  potentials  reflect  the  activity  of  the  brain.  The  use  of 
surface potentials is also convenient for the investigation of brain
functions in other primates because the technique is relatively
noninvasive and hence many experiments can be carried out on a single subject.
However,  a  price  is  paid  for  this  convenience  in  terms  of  the  information 
ultimately  available,  especially  about  the  fine  spatial  distribution  of 
neural  activity. 

The main problems experienced in the analysis of VER/EEG signals are
high signal variability, complexity of patterns, and superimposed noise.
Noise is usually separated conceptually into noise of biological origin and
noise due to recording equipment. The concept of biological noise should
always be viewed with some suspicion; what might appear to be worthless
variability of a signal to one investigator may be an informative feature to
another  investigator  who  is  able  to  relate  it  to  a  neurophysiological 
mechanism. 

From  a  more  global  point  of  view  the  problems  experienced  result  from 
the  complex  and  poorly  understood  neurophysiology  of  the  brain  with  its  many 
inputs,  the  large  number  of  experimentally  uncontrollable  quantities,  and 
the  relatively  small  energy  turnover  (=10"''  W/nerve).  Not  all  of  the 
energy  turned  over  by  the  brain  is  converted  into  electrical  energy  since 
only  a  few  nerve  fibers  are  electrically  active  (Abeles  (1)),  and  usually 
they  are  quite  distant  from  the  recording  electrodes.  In  addition,  surface 
potentials  do  not  have  a  one-to-one  correspondence  with  internal  brain 
activity, which raises questions of observability (i.e., the capability of
discriminating  brain  activity  at  different  locations). 

The  high  complexity  of  the  structure  to  be  analyzed  leads  to  a  variety 
of  considerations  about  information  gathering  and  processing  schemes. 
Information  gathering  and  processing  cannot  be  separated  into  independent 
subproblems  because  of  constraints  imposed  on  experimental  efforts,  such  as 
stimulus  complexity,  signal  recording,  and  computational  complexity.  Thus, 
depending  on  special  objectives,  different  compromises  are  sought. 

In  this  appendix  we  will  roughly  outline  several  methodologies 
currently  in  use  for  VER/EEG  analysis,  including  a  discussion  of  the  gross 
experimental  structure,  specific  problems,  experiment  design  with  different 
objectives,  and  the  "classical"  as  well  as  more  modern  methods  of  signal 
analysis.  Physiological  and  physical  considerations  will  be  investigated  in 
Appendix  B. 


SETUP  OF  VER/EEG  ANALYSIS 


A schematic of a typical VER/EEG analysis system is shown in Figure A-1,
and the flow of information is indicated.

Stimulus: the discussion of the flow of information in the arrangement
of Figure A-1 may naturally be started with the properties of the test
stimulus since historically the feedback path was rarely used. In most
instances, the stimulus pattern is two-dimensional, and basic geometric
figures are used. The patterns (and background) have to be well defined in
terms  of  brightness,  onset-offset  (or  time  course),  color,  and  angular  size. 
Care  has  to  be  taken  to  avoid  production  of  simultaneous  acoustic  signals 
such  as  clicks  from  flash  cubes  or  noises  from  static  discharges  of  the  TV 
screen. With use of a TV screen the time constant of the after-image may be
of some importance, since the visual system may subconsciously process
high-frequency (>60 Hz) information (Desmedt (28) p. 44). In general, when
using  TV  stimuli,  several  technical  characteristics  of  the  images  should  be 
obtained  from  the  manufacturer.  With  these  precautions  in  mind,  the  use  of 
a  TV  screen  is  still  viewed  as  a  very  convenient  and  valuable  source  for 
stimuli  (Desmedt  (28)  p.  8). 

Subject:  a  subject  of  an  experiment  should  be  categorized  following 
a  standard  procedure.  Personal  characteristics  such  as  visual  acuity, 
color  vision,  left-  or  right-handedness  should  be  recorded  as  deemed 
necessary.  Possibly  the  ears  should  be  covered  or  background  noise 
provided  to  mask  event-related  sounds. 

Electrodes (Leads): electrodes should be placed on (selected)
standard lead positions. Electrode type (cup, needle, capacitive) and
the use of electrolyte paste are design quantities. Some aspects related
to the choice of electrodes are discussed in Appendix E.

These  leads  should  be  routed  as  close  together  as  possible  to  avoid 
pickup  of  external  electromagnetic  or  electrostatic  interference. 

Signal  conditioning:  typically  the  electric  signal  is  fed  into  a 
high-impedance  bandpass  amplifier  which  suppresses  DC  and  frequencies 
above 300 Hz (in some instances, such as spectral analysis, mixers and
narrow-band amplifiers are used). The choice of roll-off frequencies
is usually determined by the signal-versus-noise bandwidth (where "signal"
and  "noise"  are  subject  to  interpretation,  as  discussed  in  Appendix  B). 

Signal  conversion:  the  analog  output  of  the  signal  conditioner  is 
fed  into  an  A/D  converter  which  typically  measures  several  leads  (or 
channels)  virtually  simultaneously.  The  important  characteristics  of  A/D 
converters  include  dynamic  range,  sampling  rate,  and  the  stability 
of the sampling rate. Note: actual sampling intervals may not follow
the scheduled intervals precisely when the sampling process is driven by
software executive commands. Digitizing a known waveform is thus
recommended for performance evaluation.
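
One way such a check might be carried out is sketched below. It assumes a
sinusoidal test signal of known frequency and amplitude whose phase is aligned
with the first sample; the function names and the numbers in the usage note
are hypothetical.

```python
import math


def digitize_sine(sample_times, freq_hz, amplitude=1.0):
    """Values of a test sine at the instants the converter actually sampled."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * t) for t in sample_times]


def rms_timing_error(samples, freq_hz, sample_rate_hz, amplitude=1.0):
    """RMS deviation of the digitized record from the ideal, evenly sampled sine."""
    total = 0.0
    for n, value in enumerate(samples):
        ideal = amplitude * math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz)
        total += (value - ideal) ** 2
    return math.sqrt(total / len(samples))
```

For small timing errors the RMS deviation is roughly the amplitude times
2*pi*f times the RMS timing error, divided by the square root of 2. A 10-Hz
test sine sampled nominally at 500 Hz with alternating timing errors of 0.2
msec would therefore show an RMS deviation near 0.9% of the amplitude.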

Figure A-1. VER test arrangement


Computer: the digitized data is typically buffered and/or processed
in the computer. Since information flow from many leads may lie in the
kbyte/sec (8 bits/byte) range, on-line processing of all the data with
sophisticated nondedicated software/hardware would be extremely
difficult if required. Simple processing of the data (for example, detection
of  gross  artifacts  or  lack  of  signal)  may  well  proceed  in  real  time,  while 
more  advanced  signal  analysis  can  be  performed  off-line  on  stored  data. 
This  data  may  be  stored  on  either  magnetic  tape  or  disc.  No  difficulty 
due  to  limited  transfer  rates  to  the  storage  devices  is  anticipated. 
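
The order of magnitude of the data rate can be checked with simple
arithmetic. The lead count, word length, and sampling rate in the usage note
are assumed for illustration only; the sampling rate is taken as twice the
300-Hz upper band edge mentioned under signal conditioning.

```python
def data_rate_bytes_per_sec(n_leads, sample_rate_hz, bits_per_sample):
    """Raw data rate in bytes per second, taking 8 bits per byte."""
    return n_leads * sample_rate_hz * bits_per_sample / 8.0


def epoch_storage_bytes(n_leads, sample_rate_hz, bits_per_sample, epoch_sec):
    """Storage needed for one recording epoch of the given duration."""
    return data_rate_bytes_per_sec(n_leads, sample_rate_hz, bits_per_sample) * epoch_sec
```

For example, 16 leads sampled at 600 Hz with 12-bit words give 14,400
bytes/sec, and a 60-sec epoch then occupies 864,000 bytes.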

During  the  recording,  the  computer  may  generate  various  test  stimuli, 
possibly  conditioned  on  a  simple  analysis  of  the  data.  Hence  the  loop 
shown  in  Figure  A-l  may  be  closed  in  real  time.  An  example  for  such  a 
structure  is  given  by  Vidal  (114). 


SPECIFIC  PROBLEMS  IN  VER/EEG  ANALYSIS 

Blinking,  saccades,  and  lack  of  attention  are  some  of  the  most 
deleterious  disturbances  in  VER  analysis  when  neglected.  By  means 
of separate information channels some compensation of these disturbances
may be accomplished. For example, blinking and saccades are monitored
with electrodes placed near the eye; the quick potential changes
associated with these events provide reliable timing information to change
the mode of analysis. Attention may be judged by a separate human observer
via a monitor. Clearly, the need for a human observer should be
eliminated as much as possible.
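
A minimal sketch of one such change of mode is given below: epochs in which
the eye-channel potential changes too quickly are simply excluded from the
average. Rejection is only one possible response, and the threshold, the
epoch layout, and the function names are assumptions made for illustration.

```python
def flag_contaminated_epochs(eog_epochs, max_step):
    """True for every epoch whose eye-channel record contains a
    sample-to-sample change larger than max_step."""
    return [any(abs(b - a) > max_step for a, b in zip(epoch, epoch[1:]))
            for epoch in eog_epochs]


def average_clean_epochs(eeg_epochs, flags):
    """Point-by-point average of the EEG epochs that were not flagged."""
    kept = [epoch for epoch, bad in zip(eeg_epochs, flags) if not bad]
    if not kept:
        raise ValueError("all epochs were rejected")
    return [sum(column) / len(kept) for column in zip(*kept)]
```

The threshold would have to be set from recordings of deliberate blinks and
eye movements for the electrode arrangement actually in use.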

The  background  EEG  activity  during  VER  studies  provides  a  different 
form  of  disturbance.  For  example,  the  stationarity  of  EEG  activity  cannot 
be  assumed  following  a  stimulus.  Indeed,  there  is  indication  of  EEG 
activity  entrainment  or  modulation  by  processes  following  stimulation  (Lansing 
and  Barlow  (61)).  Since  EEG  activity  appears  to  be  nonstationary  in 
itself,  it  is  hard  to  distinguish  between  random  changes  with  superimposed 
VERs  and  nonlinear  interaction  between  random  changes  and  responses 
to  a  stimulus. 

Fatigue  and  adaptation  pose  further  problems,  at  least  when  stationary 
techniques are being used for signal analysis. Adaptive or piecewise
stationary techniques are necessary to deal with this problem. The
particular choice of adaptivity or segmenting requires some prior information
about  system  behavior,  especially  for  real  time  analysis;  this 
prior  knowledge  may  be  represented  in  the  form  of  a  particular  heuristic. 

A  difficulty  on  the  hardware  side  in  VER/EEG  analysis  might  arise 
from  uncertainty  of  electrode  positioning  when  experiments  are  to  be 
repeated or compared among subjects. One way to cope with this
problem is to first study positioning sensitivity and select points which
appear to be insensitive. Alternatively one may position electrodes so
as to obtain a specific transfer characteristic from the stimulus to the
electrode. For example, signal processing tools may be used to give an
indication of correct electrode placement. A completely different approach
might involve photographic documentation or other more "physical means"
to generate reproducible electrode placements.


CLASSICAL METHODS FOR VER/EEG ANALYSIS


The  two  classical  methods  of  VER/EEG  analysis  are  based  on  averaging 
and spectral analysis. These techniques were developed in other sciences,
spectral analysis especially in engineering in the context of oscillating
systems. These techniques are widely used today for the analysis of
biological signals, mainly because of their well understood, usually fairly
simple properties and simple implementation. They should be regarded
as important techniques that have proved successful in a variety of areas. In some
situations  the  methods  are  reasonably  easily  extended  to  approximately 
describe  nonlinear  and  "moderately"  nonstationary  processes. 

It  should  be  noted  that  for  the  special  case  of  analyzing  linear 
processes, averaging and comb filtering (one of the spectral techniques,
analyzing a signal at integer multiples of the stimulus frequency) are
intimately  related  via  the  Fourier  transform. 
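
This relation can be checked numerically. Averaging N sweeps taken one
stimulus period T apart is a linear filter whose frequency response is
(1/N) times the sum of exp(-j*2*pi*f*k*T) for k = 0, ..., N-1. The sketch
below evaluates its magnitude; the function name and the numbers in the usage
note are hypothetical.

```python
import cmath
import math


def averager_gain(freq_hz, stimulus_rate_hz, n_sweeps):
    """Magnitude response of averaging n_sweeps consecutive stimulus periods."""
    period = 1.0 / stimulus_rate_hz
    total = sum(cmath.exp(-2j * math.pi * freq_hz * k * period)
                for k in range(n_sweeps))
    return abs(total) / n_sweeps
```

With a stimulus rate of 2 Hz and 16 sweeps, the gain is 1 at 2, 4, 6, ... Hz
and zero at 2.5 and 3 Hz, which is the comb characteristic. The teeth become
narrower as the number of sweeps grows.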

These  classical  methods  of  signal  analysis  may  be  regarded  as  nonparametric. 
They  provide  a  simple  tool  to  describe  input-output  relations  of  dynamical 
systems.  Often  these  techniques  are  inexpensively  implemented  on  analog 
circuitry. 


RECENT  METHODS  FOR  VER/EEG  ANALYSIS 

With  the  advent  of  the  computer  and  fast  analog-to-digital  converters, 
more  complex  techniques  for  the  analysis  of  signals  became  feasible.  These 
techniques  are  especially  useful  for  treating  the  randomness  of  signals  in 
natural  processes.  The  Fast  Fourier  Transform,  autoregressive  modeling, 
and  Karhunen-Loeve  expansion  are  among  the  most  prominent.  For  the  purpose 
of spectral estimation the maximum entropy approach has drawn much
attention. From the usual specification of the entropy induced by a filter
(following Bartlett (9)) this form of spectral estimation leads
essentially to autoregressive modeling. A close "relative" of the
Karhunen-Loeve expansion is also well known in statistical analysis, in a
slightly different setting, under the title of principal component analysis.
All of these methods are based on relatively strong linearity and
stationarity assumptions. The use of these methods is widespread today and
common  in  pattern  classification. 
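
The connection between autoregressive modeling and spectral estimation can be
sketched as follows. The sketch fits the AR coefficients from sample
autocorrelations by the Levinson-Durbin recursion (the Yule-Walker approach)
and evaluates the resulting all-pole spectrum. Maximum entropy estimates are
more commonly computed with Burg's algorithm, so this is a simpler stand-in
rather than the maximum entropy method itself; the function names are
hypothetical.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Biased sample autocovariance for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t + lag] for t in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for the model
    x[n] + a[1] x[n-1] + ... + a[p] x[n-p] = e[n], with a[0] = 1.
    Returns (a, prediction_error_variance)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / err
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + reflection * a[m - k]
        new_a[m] = reflection
        a = new_a
        err *= 1.0 - reflection ** 2
    return a, err


def ar_spectrum(a, err, freq_hz, sample_rate_hz):
    """All-pole spectral estimate (up to a constant scale factor)."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    denom = sum(a[k] * cmath.exp(-1j * w * k) for k in range(len(a)))
    return err / abs(denom) ** 2
```

The spectrum is obtained at any frequency directly from a handful of
coefficients, which is what makes this form attractive for short records.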

It appears to us that many recent techniques are not fully
exploited, in particular, techniques which allow modifications to test
nonlinearities and certain forms of nonstationarities. For example,
one  of  the  important  contributions  of  Box  and  Jenkins  (14)  was  to 
develop  systematic  approaches  to  the  use  of  time-series  analysis  in  the 
time  domain  (with  some  reference  to  spectral  representation)  which  can 
be  followed  by  the  statistical  layman.  The  method  advocated  is  the 
so-called autoregressive-moving average technique which can describe
all possible (finite dimensional) linear time processes. Yet for reasons
mentioned in this appendix and discussed in more detail in Appendix B, the
modeling of VER/EEG may call for more general forms of nonlinearities
and nonstationarities than those outlined by Box and Jenkins (14).
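
For reference, an autoregressive-moving average model of orders p and q can
be written in one common notation as shown below. The symbols are not taken
from this report, and sign conventions for the moving-average coefficients
differ between authors.

```latex
x_t = \sum_{i=1}^{p} \phi_i \, x_{t-i} + e_t + \sum_{j=1}^{q} \theta_j \, e_{t-j}
```

Here x_t is the observed series, e_t is a white-noise driving sequence, the
phi_i are the autoregressive coefficients, and the theta_j are the
moving-average coefficients. Setting q = 0 gives the pure autoregressive
model.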

Scientific  Systems,  Inc.  has  developed  extensive  expertise  and  automatic 
software  to  perform  such  analysis.  In  fact,  the  routines  available  to  us 
are still more general in terms of modeling nonstationarity and nonlinearity.

The area of system identification in the engineering sciences has,
of necessity, addressed the issues of fairly strong nonlinearities and
nonstationarities. Important developments which allow consideration of
these issues are extensions of the Kalman filter algorithm and modern
control theory. These techniques are of vital importance today in
aircraft trajectory estimation and control. As a supplement, describing
function analysis, which developed in a rather straightforward fashion
from classical spectral analysis and from certain statistical concepts,
provides a good tool for modeling some of the interesting features of
nonlinear systems such as the mixing of frequencies, limit cycles,
subharmonics, and entrainment. For the successful application of these
techniques considerable computational effort may be required. The high
computational burden is in good part due to the iterative nature of solving
the nonlinear equations associated with parameter estimation in these
systems. The techniques in use today often require searches in a 10- to
50-dimensional parameter space to find the best fit.

Thus  a  variety  of  practical  and  theoretical  problems  related  to 
numerical  accuracy  and  uniqueness  of  solutions  arise  for  the  researcher. 

To  deal  effectively  with  these  problems  one  should  proceed  stepwise  in 
the  augmentation  of  models,  and,  in  our  opinion,  as  much  as  possible  start 
out  with  "meaningful"  models.  This  approach  can  save  considerable  amounts 
of computation since it allows incorporation of prior knowledge about
structure  into  the  models.  Diagnostic  checking  may  then  lead  to  approval, 
modification,  and  possibly  augmentation  of  conjectured  structures.  In 
some  instances  one  should  also  consider  whether  adding  alternative  forms 
of  measurements  may  reduce  computations  through  improved  observability 
of  parameters  or  decomposition  of  a  model  into  simpler  structures,  thus 
simplifying the multidimensional searches. To show some of the possible
considerations in this context, Appendix B is devoted to an investigation
of the physiological and physical structure of VER/EEG analysis.
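
One simple form of diagnostic checking is sketched below: if a conjectured
model is adequate, its residuals should be close to white, so few of their
autocorrelations should fall outside the approximate 95% band of
+/- 1.96/sqrt(N). This particular check and the function names are our
illustration under that assumption, not a procedure prescribed above.

```python
import math
import random


def residual_autocorrelations(residuals, max_lag):
    """Normalized autocorrelations of the residuals for lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    variance = sum(c * c for c in centered)
    return [sum(centered[t] * centered[t + lag] for t in range(n - lag)) / variance
            for lag in range(1, max_lag + 1)]


def count_significant_lags(residuals, max_lag=20):
    """Number of lags whose autocorrelation lies outside +/- 1.96/sqrt(N)."""
    bound = 1.96 / math.sqrt(len(residuals))
    rhos = residual_autocorrelations(residuals, max_lag)
    return sum(1 for rho in rhos if abs(rho) > bound)


def simulate_ar1(coefficient, n_samples, seed):
    """Synthetic series for demonstration; coefficient 0 gives white noise."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n_samples - 1):
        x.append(coefficient * x[-1] + rng.gauss(0.0, 1.0))
    return x
```

A count well above about 5% of the lags tested suggests that structure
remains in the residuals and that the model should be augmented.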


APPENDIX B


PHYSIOLOGICAL  AND  PHYSICAL  CONSIDERATIONS  IN  VER/EEG  SIGNAL  ANALYSIS 

In this appendix, we will give an outline of the overall signal
structure under the special consideration of its physiological and
physical origin. We start out with a review of physiological mechanisms
responsible for electrical potentials and potential changes and turn
then  to  what  we  term  the  Eye-Brain-Electrode  model.  This  model  is 
motivated  by  the  different  forms  of  information  currently  available 
about  the  eye,  the  brain,  and  properties  of  electrodes.  In  particular, 
such  a  partitioned  model  may  be  useful  in  analyzing  specific  saturations 
that  take  place  in  flash  responses.  Current  literature  appears  to  have 
disregarded  this  overall  view  of  VER/EEG  signals.  Finally,  the  structure 
of  the  experimental  setup  is  reviewed  as  it  may  pertain  to  the  experiment 
design  and  data  acquisition. 


THE  ELECTRICAL  ACTIVITY  OF  LIVING  CELLS 

Living  cells  require  an  electric  potential  difference  across  the  cell 
membrane. This bipolymer sheet, about 100 Å thick, is by itself
not capable of withstanding the high osmotic pressure of proteins in the
cell  interior;  that  osmotic  pressure  is  balanced  by  the  osmotic  pressure 
and  electric  force  of  ions.  Special  sodium  pumps  shuttle  sodium  ions 
toward  the  cell  exterior  rendering  the  inside  negative.  Some  of  the  ions 
leak  back  and  hence  the  pumps  must  remain  active. 

Nerve  cells  utilize  this  potential  difference  to  propagate  actively 
(that  is,  in  a  regenerative  fashion)  variations  of  the  potential  along 
the cell body. These potential variations are regarded as action
potentials, and they consist of a short reversal of the polarity of the (local)
cell  interior.  Typically  the  duration  for  such  an  action  potential  is 
in  the  1-msec  range,  and  the  spatial  length  of  the  potential  reversal 
in  the  centimeter  range.  Propagation  speeds  vary,  depending  on 
myelination and diameter of an axon. The action potential follows an
all-or-nothing law, and always results in a signature of the time and
space distribution (Abeles (1)).

The  initiation  of  an  action  potential  may  occur  in  several  ways, 
for example, via specialized receptor cells or from another nerve cell
via  the  dendritic  tree.  This  dendritic  tree  serves  as  an  approximate 
integrator  of  postsynaptic  currents  in  time  and  space.  These  postsynaptic 
currents,  in  turn,  result  from  chemical  transmitter  substances  released 
by  the  synapses  of  other  nerves.  The  effect  of  potential  changes  may 
either  facilitate  or  inhibit  the  possibility  of  generating  an  action 
potential.  The  delay  associated  with  the  transmission  of  information 
from  one  nerve  to  the  other  may  be  as  short  as  1  msec  per  synapse- 
dendrite  "relay."  Thus  response  time  can  give  a  clue  about  the  number 
of  sequential  relays  and  hence  complexity  of  a  neural  pathway. 

In contrast to the action potential, the unit of information which
travels along a nerve axon, no such unit exists in the dendritic tree.
As mentioned above, the dendritic tree functions more in an integrative
linear fashion. Small inputs into many of these dendritic trees might
thus be expected to result in changes of neural interaction. This view
is supported by observed behavior changes in animals and the selective
release of calcium ions (Adey (4)) in brain tissue by means of
low-amplitude electric fields. Adey suggests that demodulation of modulated
high-frequency fields occurs on the (asymmetric) bipolymer sheet of cell
membranes, inducing changed transmembrane potentials. This observation
leads to the concept of slow electric potential changes serving as a
possible second (electric) message system.

The  nerve  cells,  composed  of  the  dendritic  tree,  axon,  and  terminal 
branches, constitute the majority of cells in the brain. They are
organized in bundles for signal transmission and in nuclei for information
processing.  The  arrangements  of  dendrites  and  terminal  branches  appear 
to  follow  a  random  pattern  (Abeles  (1)).  Electric  activity  among  the 
various  nerve  cells  also  appears  to  follow  random  time  processes,  though 
not  independent  among  nerve  cells  and  with  respect  to  stimuli. 

In  recording  the  electric  activity  of  the  brain  through  EEG  and  VER, 
it is the random superposition of extracellular far fields which is observed,
since typically electrodes are separated from active cells by centimeters.
It should be said that it is not clear today how much the voltage
fluctuations observed result from action potentials and slow potential changes
in dendrites, but both mechanisms are implicated.

Having set out some of the important aspects of bioelectrical
potentials, we may turn to the eye-brain-electrode model and discuss the effect
of these three components on the recorded signal.


THE  EYE-BRAIN-ELECTRODE  MODEL 

The  concept  of  the  eye-brain-electrode  model  was  motivated  by  the 
investigation  of  the  stimulus-response  path  and  the  different  properties 
and  forms  of  prior  knowledge  about  the  components  along  that  path.  The 
eye  appears  to  be  physiologically  fairly  well  understood  in  terms  of  its 
transmission  properties  and  control.  Thus  some  physiological  modeling 
is suggested for the sake of stimulus design and transmission
characterization.

In  comparison  with  the  eye,  the  brain  is  functionally  extremely 
complex  and  information  for  the  characterization  of  signal  transmission 
is  very  incomplete.  In  addition,  the  neuroanatomy  of  the  brain  is  highly 
species-specific,  prohibiting  simple  extrapolation  to  other  species.  For 
example, the neuroanatomic structure of layers in the visual cortex of
humans and other closely related primates is quite different in quality
and  number.  Hence,  for  the  purpose  of  description,  one  is  forced  to 
resort  to  abstract  models  which  may  not  resemble  the  underlying  structure. 
Nevertheless,  some  information  about  potential  pathways  and  processing 
mechanisms  is  available  and  should  be  considered  in  the  selection  and 
comparison  of  models. 

Finally,  it  is  realized  that  all  signals  recorded  are  affected  by  the 
transmission  properties  and  location  of  the  electrode.  Hence  they  call 
for  separate  consideration.  Proper  use  of  electrodes,  lead  placement,  and 
impedance  matching  may  lead  to  improved  signal  quality  and,  potentially, 
to  new  information. 

The Physiological Model of the Eye

The anatomy of the eye is shown in Figure B-1. Several of its components
appear to be important factors in the signal transformation from an optic
image to the optic nerve: the iris, the lens, and the retina.

Figure B-1 (after Leibovic (63)). Optical arrangement in human eye
and its luminance efficiency.

The importance of the iris is due not only to its well-known control of
illumination  and  consequent  effect  on  image  quality,  but  also  its  rather 
stochastic  behavior.  Stanten  and  Stark  (105)  demonstrated  the  consider¬ 
able  random  fluctuations  of  pupil  area  at  different  levels  of  illumina¬ 
tion;  amplitudes  may  fluctuate  about  20%  (1  S.D.)  and  with  a  time  constant 
of  several  seconds  (cf.  Figures  B-2  and  B-3).  They  also  noted  the  strong 
correlation  of  left  and  right  pupil  area  noise.  This  suggests  a  common 
pathway  for  the  processing  of  illumination  information,  and  they  give 
a  stochastic  model  for  these  processes. 

Following the path of light, the lens is the second element modulating
image quality. Stanten and Stark (105) showed a dynamic limit
cycle behavior of the focal length of the lens. For this they developed a
deterministic  nonlinear  control  model  which  accounts  for  oscillations  around 
2  Hz  (cf.  Figures  B-4  and  B-5).  An  obvious  purpose  of  this  system  is  to  track 
focusing  by  testing  blurs  on  the  retinal  image,  much  like  automatic  man-made 
systems do. These oscillations appear to have a 1/f type (flicker) noise
superimposed on them at still lower frequencies. For understanding
properties of the retinal image one should be aware of these processes,
especially since they are strong enough to drive an internal servo mechanism.

Another effect on the retinal image associated with the lens results from
its strong chromatic aberration. Thus not all colors are simultaneously
in focus, and typically a 2-diopter myopic correction (50-cm negative
focal length) is necessary to focus blue when red is focused (Desmedt
(28)). Recall the common experience of the glare of blue lights (e.g.,
from emergency vehicles) at night. With the above-mentioned oscillatory
lens  control,  focusing  of  different  colors  at  different  times  occurs. 

Also,  due  to  the  geometry  of  the  retina  and  the  lens,  sharp  images  are 
confined  to  the  macula  lutea  with  its  fovea  centralis.  However,  the 
area in focus may move (expanding and contracting) in connection with
lens  oscillations. 

The  retina,  the  organ  which  receives  the  optic  image,  is  equipped 
with  a  large  number  of  different  receptors.  Rods  serve  night  vision 
(scotopic  vision)  and  three  types  of  cones  serve  daylight  color  vision 
(chromatic  photopic  vision).  The  dynamic  range  of  this  system  is  about  5 
orders  of  magnitude  in  light  intensity,  far  in  excess  of  the  dynamic  range 
of the pupil, but with much slower adaptation (Figure B-6).

The  highly  regular  arrangement  of  cells  in  the  retina  has  an  in¬ 
teresting  effect  on  the  electrical  properties  of  the  eye:  it  renders  it 
an  approximate  dipole  with  the  dipole  moment  approximately  aligned  with 
the  optic  axis.  This  property  can  be  exploited  to  determine  (within 
limits)  eye  position  or  at  least  eye  movement. 

For  the  purpose  of  investigating  quick  dynamic  changes  of  illumina¬ 
tion,  the  modeling  of  the  kinetics  of  pigment  synthesis  in  rods  and  cones 
is  important  since  the  recurrence  of  vision  secondary  to  flash  stimulation 
is  limited  by  these  kinetics  (Leibovic  (63)).  The  kinetics  are  probably 
different  for  rods  and  cones,  and  possibly  even  different  for  cones  with 
different  color  pigments. 

The  lateral  interaction  between  visual  receptors  is  of  special 
importance  in  color  contrast  experiments.  Fortunately,  in  the  case  of 
the  human  retina,  there  seem  to  be  no  efferent  neural  networks.  In  cats 
such  efferent  networks  voluntarily  change  retinal  performance. 

In  addition  to  the  above-mentioned  nonlinearities  and  nonstationari- 
ties,  a  different  kind  of  static  nonlinearity  is  described  by  Leibovic 
(63).  For  intermediate  light  levels  he  supports  the  logarithmic-type 
Weber-Fechner law. However, for very low light levels a square root law
is  theoretically  and  experimentally  more  appealing.  He  also  discusses 
the  expected  deviations  from  the  Weber-Fechner  law  for  very  high  light 
levels.  These  nonlinearities  are  interesting,  because  they  allow  the 
stimulus  design  to  be  such  that  the  input  to  the  optic  nerve  follows  an 
arbitrary  function,  possibly  a  sinusoid. 
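
The stimulus-design idea above can be sketched numerically. The following is a minimal illustration only, assuming a purely logarithmic (Weber-Fechner) response r = k ln(I/I0) at intermediate light levels; the constants `k` and `i0` and both function names are hypothetical and are not taken from Leibovic (63).

```python
import numpy as np

def weber_fechner_response(intensity, k=1.0, i0=1.0):
    """Assumed static nonlinearity of the receptor stage: r = k * ln(I / I0)."""
    return k * np.log(intensity / i0)

def stimulus_for_target(response, k=1.0, i0=1.0):
    """Invert the assumed law to get the light intensity that would
    produce a desired response: I = I0 * exp(r / k)."""
    return i0 * np.exp(response / k)

# Desired sinusoidal input to the optic nerve (arbitrary units, 5 Hz).
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
target = 2.0 + 0.5 * np.sin(2.0 * np.pi * 5.0 * t)

# Pre-distorted light stimulus and the response it would evoke
# under the assumed logarithmic law.
stimulus = stimulus_for_target(target)
recovered = weber_fechner_response(stimulus)
```

Under this assumption the required light waveform is an exponentiated sinusoid, so its peaks are sharper than its troughs. At very low or very high light levels, where the text notes deviations from the Weber-Fechner law, a different inverse would be needed.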

Another  interesting  effect  associated  with  the  retinal  image  processing 
is  its  superresolution  of  lines:  resolution  of  lines  is  not  limited  by 
the  spacing  of  receptors.  Instead,  local  spatial  interaction  via  some 
sort  of  averaging  allows  considerably  higher  resolution. 

Finally,  we  should  consider  changes  of  retinal  image  due  to  gross 
eye  movement.  Three  mechanisms  should  be  distinguished:  a  smooth  system 
for  smooth  pursuit,  a  saccadic  system  for  fast  positioning  and  correction 
of  errors,  and  a  slow  drift  of  the  optical  axis.  Interestingly,  the 
dynamics for the horizontal and vertical systems are quite different. To
avoid the blurring of images during saccades, visual perception is reduced
(even for some time after the movement is completed, until the "wobbling"
of the eyeball has sufficiently died out), for a total of about 30 msec
(Leibovic (63)).

Figure B-2. Schematic of autocorrelation of pupil noise.
The correlation between the two eyes is up to 95%.


Figure  B-3.  Schematic  of  distribution  of  pupil  area  at 
constant  illumination.  Pupil  area  is  also 
affected  by  various  reflex  mechanisms. 


Figure  B-4.  Power  spectrum  of  focal  length  of  lens 
when  sharp  images  are  presented. 


Figure B-5. Schematic of spatial frequency resolution (SFR)
versus focal length of lens f. Axes: 1/SFR against focal-length
offset in diopters.

Figure B-6 (after Leibovic (63)). Variation of light detection
threshold due to dark adaptation.


The drift phenomenon, following Leibovic (63), is similar to Brownian
motion  of  the  optical  axis.  When  certain  critical  limits  of  the  positioning 
error  are  exceeded,  correcting  saccades  are  invoked  automatically. 

A different type of optical signal is created by blinking. Electrical
potentials associated with this muscle activity might be detected
by  electrodes  placed  near  the  eye;  these  electrodes  may  simultaneously  be 
used  to  detect  saccadic  eyeball  movements.  Since  the  mechanisms  for  gener¬ 
ating  these  potentials  are  somewhat  different,  discrimination  may  be 
possible. 

Finally, stereoscopic vision should be considered. The long
evolution of the visual system seems to have led to a strong integration
of individual components. Thus, Stark (personal note) found a correlation
of the angle of the optic axis with the focusing of the lens. The angle,
in turn, is driven by the difference in retinal images. He also discusses
the importance of considering the rotation of the eyeball around the
optic axis during these dynamic maneuvers.

Modeling the Brain


The  fast  processing  of  information  within  the  brain  is  based  on 
electrochemical  processes.  Some  of  these  processes,  such  as  neural  action 
potentials,  are  of  short  duration  and  fixed  waveform  but  occur  with 
variable  time  intervals.  The  intervals,  even  though  random,  are  not 
independent. Thus, even though individual responses to stimuli seem to
fluctuate,  there  is  a  consistent  relation  between  them.  In  particular, 
in  the  case  of  VERs,  variable  amplitudes  and  latencies  are  observed. 

Little  is  known  about  the  dynamics  of  these  variations,  but  their  impor¬ 
tance  for  signal  enhancement  is  recognized  (McGillem  and  Aunon  (69)).  In 
some  situations,  as  in  acoustic  click  stimuli  experiments,  a  relation 
between  signal  amplitudes  and  latencies  was  found  (Moller  (72)),  in 
agreement with elementary models of synaptic signal transmission (Eccles
(36)).  The  establishment  of  such  models  and  their  incorporation  into 
estimation  schemes  may  be  very  helpful  in  developing  powerful  and 
efficient  techniques. 

Some  of  the  difficulties  which  are  suggested  by  our  understanding  of 
the functioning of the brain should also be emphasized. For example,
the  transmission  of  signals  is  apparently  accomplished  via  different  path¬ 
ways,  each  "modulating"  the  signal  in  its  own  fashion  in  terms  of 
amplitudes  and  latencies,  possibly  even  influenced  by  some  of  the  other 
pathways.  An  example  of  this  mechanism  is  the  processing  of  different 
colors  in  "color  channels"  (Regan  (87))  and  the  complex  evoked  sensation 
(Land  (60)). 

Another difficulty arises from the nonlinear behavior of information
processing: in some cases, it is not clear whether evoked potentials
result from the stimulus directly or from the modulation (entrainment)
of other electrical activity of the brain, such as the
modulation of the EEG (Lansing and Barlow (61)).

One  of  the  potentially  interesting,  but  more  speculative  aspects  of 
the  EEG  and  VER  analysis,  is  the  consideration  of  frequencies  above  the 
currently  used  values  (say  between  50  Hz  and  1000  Hz).  It  is  usually 
argued  that  the  power  of  the  spectrum  falls  off  quickly  above  50  Hz,  hence 
higher  frequencies  do  not  represent  much  information.  From  an  information 
theoretic  point  of  view,  this  interpretation  is  incorrect.  The  crucial 
quantity  to  be  looked  at  is  the  signal-to-noise  ratio.  As  mentioned 
earlier,  only  thermal  electronic  noise  can  safely  be  regarded  as  noise. 
Another interesting aspect is the possible effect of the dielectric
constant of the brain on high-frequency transmission. For the low
frequencies currently studied, this dielectric effect can safely be
disregarded (Desmedt (28)); for higher frequencies up to the kHz range,
however, we have not yet found relevant information. The interest in
high  frequencies  is  due  to  1)  their  existence:  the  spectrum  of  action 
potentials  reaches  into  the  kHz  range,  and  2)  their  possible  value  in 
locating  signal  sources.  Globally,  increasing  the  signal  bandwidth  of 
analysis  implies  increased  information  flow. 

On  physical  grounds,  one  may  expect  a  different  character  in  signals 
at  high  frequencies:  first,  because  of  the  above-mentioned  dielectric 
effect; second, because of the loss of the phase relation with respect
to stimuli as a result of latency variations (often called jitter). These
different properties may call not only for different processing techniques
but also for different preprocessing before digitizing and for different
designs of electrodes and preamplifiers. The following section should
serve to clarify some of these considerations.
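
The loss of phase relation caused by jitter can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of real VER data: each trial is taken to be a pure sinusoid whose latency is drawn from a Gaussian distribution, and the function name, the 2-msec jitter, and the trial count are all hypothetical choices. Under these assumptions the stimulus-locked average is attenuated by exp(-(2 pi f sigma)^2 / 2), so high-frequency components largely vanish from the average even though they are present in every trial.

```python
import numpy as np

def averaged_amplitude(freq_hz, jitter_sd_s, n_trials=2000, seed=0):
    """Amplitude of the stimulus-locked average of unit-amplitude
    sinusoids whose latencies are jittered (Gaussian, s.d. jitter_sd_s)."""
    rng = np.random.default_rng(seed)
    t = np.arange(0.0, 1.0, 1e-3)                    # 1 s at 1 kHz
    latency = rng.normal(0.0, jitter_sd_s, n_trials)
    trials = np.sin(2.0 * np.pi * freq_hz * (t[None, :] - latency[:, None]))
    average = trials.mean(axis=0)
    # The average is again a sinusoid; amplitude = sqrt(2) * rms.
    return np.sqrt(2.0 * np.mean(average ** 2))

def predicted_attenuation(freq_hz, jitter_sd_s):
    """Expected attenuation for Gaussian latency jitter."""
    return np.exp(-0.5 * (2.0 * np.pi * freq_hz * jitter_sd_s) ** 2)

jitter = 2e-3                                        # assumed 2 msec jitter
low = averaged_amplitude(10.0, jitter)               # 10 Hz component
high = averaged_amplitude(200.0, jitter)             # 200 Hz component
```

This suggests why plain averaging, which works for low-frequency VER components, may be a poor tool above a few tens of Hz unless latency is estimated and corrected first.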


Considerations  About  Electrodes 

Roughly,  the  use  of  electrodes  is  characterized  by  1)  their  electro¬ 
chemical  characteristics  (type),  2)  number  used,  3)  placement,  and  4)  size. 
The  important  electric  properties  of  (external)  electrodes  are  well 
described  by  the  linearized  characterization  of  the  electric  half  cell. 

They imply an interesting small-signal frequency characteristic: with
increasing frequency their impedance decreases (Figure B-7). Many
investigators dislike this frequency-dependent characteristic of electrodes
and reduce its effect by the use of a high-input-impedance amplifier.
However,  increased  input  impedance  results  in  an  effective  thermal 
electronic  noise  power  (excluding  flicker  noise  at  the  moment)  in  the 
final  signal,  proportional  to  that  input  impedance.  For  the  purpose  of 
sophisticated  signal  analysis  such  noise  sets  a  limit  for  the  perfor¬ 
mance  of  any  scheme,  but  the  frequency  dependency  does  not.  In  fact, 
since  the  electrode  characteristics  can  easily  be  modeled,  they  would 
not  significantly  decrease  performance  of  sophisticated  signal  analysis. 
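
The proportionality between thermal noise power and impedance can be made concrete with the standard Johnson-Nyquist formula, v_rms = sqrt(4 k T R B). The sketch below assumes the noise can be modeled as that of a single equivalent resistance R; the resistance values, the 100-Hz bandwidth, and the body-temperature default are illustrative assumptions, and flicker noise is ignored as in the text.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K

def thermal_noise_rms(resistance_ohm, bandwidth_hz, temperature_k=310.0):
    """RMS thermal (Johnson) noise voltage of an equivalent resistance."""
    return math.sqrt(4.0 * BOLTZMANN * temperature_k
                     * resistance_ohm * bandwidth_hz)

# Noise voltage over an assumed 100-Hz analysis bandwidth for a range
# of equivalent resistances (hypothetical values).
for r in (1e3, 1e4, 1e5, 1e6):
    v = thermal_noise_rms(r, 100.0)
    print(f"R = {r:9.0f} ohm  ->  {v * 1e6:.3f} microvolt rms")
```

Noise power (the square of the rms voltage) grows linearly with R, so a tenfold increase in equivalent resistance raises the rms noise by a factor of about 3.16.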

For  the  purpose  of  high  frequency  measurements  with  a  given  sampling 
interval  of  the  analog-to-digital  converter  a  frequency  shift  is  necessary. 
That  shift  may  be  done  at  the  output  of  the  analog  preamplifier.  It  should 
be  mentioned  that  a  separate  preamplifier  may  be  desirable  for  high- 
frequency  amplification  to  match  the  lower  electrode  impedance  in  that 
frequency  range  (compare  Figure  B-7).  A  consequence  of  the  use  of  low- 
input impedance amplifiers is an increased sensitivity to electrode and
scalp  interface  impedance  changes.  This  sensitivity  may  be  overcome  by 
periodic  (recalibration)  sampling  of  the  impedance,  but  care  must  be 
taken  to  limit  the  calibration-signal  amplitude  to  avoid  interference 
with  brain  activity. 

The  number  and  placement  of  electrodes  are  interesting  experiment 
design  features.  The  number  may  be  limited  by  convenience  of  application, 
available  preamplifiers,  and  the  analog-to-digital  conversion  capability 
of  the  computer.  Data  storage  may  impose  further  practical  limits  on  the 
number  of  electrodes  used.  The  placement  should  clearly  follow  some 
anatomical  considerations  about  the  origin  of  various  signals.  The 
determination  of  "good"  locations  may  proceed  interactively  with  the 
signal  analysis.  In  this  context  the  issue  of  placement  sensitivity  and 
reproducibility  plays  an  important  role.  The  location  of  the  ground, 
or  reference,  electrode  is  also  important  in  specifying  the  differential 
potentials  actually  measured  (see  Cobbold  (26,  p.  431)). 

Figure B-7a (after Cobbold (26)). Frequency and size dependence of
impedance in skin electrodes.

Figure B-7b (after Cobbold (26)). Frequency dependence of complex
impedance in skin electrodes.


The  size  of  electrodes  seems  to  be  a  further  important  design  factor. 
Clearly,  for  low  impedances  which  reduce  thermal  electronic  noise,  large 
diameters  are  desirable.  Even  though  spatial  resolution  may  be  lost, 
there  are  some  indications  that  information  for  the  purpose  of  response 
classification  may  not  so  much  be  contained  in  spatial  characterization 
(Squires  and  Donchin  (104)).  In  that  study  of  schemes  which  classified 
different  stimuli  based  on  evoked  responses,  a  classifier  based  on  a 
linear  superposition  of  tracings  from  different  leads  performed  nearly 
as  well  as  the  "optimal"  classifier  (a  Karhunen-Loeve  representation  was 
used  in  both  cases).  Furthermore,  assuming  dipole  models  for  generators 
of  electric  fields,  the  distance  of  electric  dipoles  to  the  electrode 
should  affect  the  choice  of  electrode  use.  A  pragmatic  approach  might  be 
based  on  trying  different  size  electrodes  and  determining  performance  in 
conjunction  with  various  signal  analysis  schemes. 

In  summary,  the  current  methods  of  data  acquisition  and  signal  analysis 
should  be  rethought  in  view  of  the  flexibility  and  adaptability  of  modern 
computerized  signal  analysis.  Demands  different  from  traditional 
VER  and  EEG  analysis  are  present  today. 


APPENDIX  C 


FREQUENCY  DOMAIN  SIGNAL-PROCESSING  TECHNIQUES 


DIGITAL  PROCESSING  CONSIDERATIONS 

Several practical considerations arise in the digital processing of
analog (continuous) data. These issues concern the relationship between
the digital numbers being computed and the analog data, spectrum, or other
information originally sought. Excellent general references for this
subject are Oppenheim and Schafer (76) and Papoulis (78).


Sampling 

An analog-to-digital (A/D) converter accepts a continuous input and
creates a discrete (in time and range) output. Several types of A/D
converters and sampling circuits are available, but the casual user is
most concerned with the rate of sampling (discrete time) and quantization
(discrete level or state) of the digital word. The usual precaution taken
in digital processing is to ensure that the sampling rate is above the
Nyquist rate, which is twice the maximum frequency in the signal
being analyzed. If this is true, then no "aliasing," or folding of high
signal frequencies into low-frequency digital artifacts, will occur.

To avoid aliasing when high-frequency noise (or signal) is present,
an  anti-aliasing  filter  must  be  placed  before  the  A/D  converter.  The 
corner  frequency  of  the  filter  should  be  above  the  maximum  desired  signal 
bandwidth  and  at  least  a  factor  of  2  below  the  sampling  frequency.  A 
factor  of  3  or  4  is  usually  desirable  since  the  anti-aliasing  filter 
passes  some  noise,  although  attenuated,  above  its  corner  (-3dB)  frequency. 
If  enough  is  known  about  the  filter  and  noise  properties,  the  exact 
consequences  of  aliasing  can  be  computed. 
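
A short numerical sketch shows what happens when no anti-aliasing filter is used. The sampling rate, record length, and tone frequency are arbitrary illustrative choices: a 170-Hz tone sampled at 200 Hz lies above half the sampling rate and therefore folds down to 200 - 170 = 30 Hz.

```python
import numpy as np

fs = 200.0                      # sampling rate in Hz (assumed)
n = np.arange(400)              # 2 s of data
t = n / fs

f_high = 170.0                  # tone above fs / 2 = 100 Hz
x = np.sin(2.0 * np.pi * f_high * t)

# Locate the strongest component in the sampled record.
spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
apparent = freqs[np.argmax(spectrum)]

print(f"true frequency {f_high} Hz appears at {apparent} Hz")
```

Once the data are sampled, the folded component is indistinguishable from a genuine 30-Hz signal, which is why the filtering has to happen in the analog stage before the A/D converter.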

The second effect of sampling, quantization or discretization of the
signal level (amplitude), is harder to examine. In general, the
quantization should be fine (small) enough so that the discrete signal is an
accurate  representation  of  the  analog  process.  The  adequacy  of  the  rep¬ 
resentation  may  be  analyzed  by  assuming  that  uniformly  distributed  quan¬ 
tization  errors  are  added  to  the  desired  signal.  A  second  quantization 
error  occurs  in  the  processing  of  digital  data.  This  error  is  harder  to 
analyze,  but  bounds  have  been  developed  for  FFT  algorithms  as  a  function 
of  the  number  of  data  points  (and  therefore  multiplications)  that  are 
used. 
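
The uniform-error model for A/D quantization can be checked numerically. In this minimal sketch the 8-bit word length, the full-scale range, and the test signal are all assumptions chosen for illustration; for a step size q the model predicts an error variance of q^2 / 12.

```python
import numpy as np

def quantize(x, bits, full_scale=1.0):
    """Round x to the nearest level of a uniform quantizer spanning
    [-full_scale, +full_scale] with 2**bits levels (no clipping applied)."""
    step = 2.0 * full_scale / (2 ** bits)
    return step * np.round(x / step), step

rng = np.random.default_rng(0)
x = rng.uniform(-0.9, 0.9, 100_000)     # test signal spanning many levels

xq, step = quantize(x, bits=8)
error = xq - x

measured_var = error.var()
predicted_var = step ** 2 / 12.0        # uniform-error model
```

The model holds well when the signal crosses many quantization levels. It becomes unreliable for signals that are small compared with one step, which is one reason the adequacy of the representation is harder to examine than the sampling rate.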

In general, the cost and speed of very accurate A/D converters prevent
their use in many applications, and computer word lengths are
much longer than the word length of the input signal. This extra computer
word length is desirable, however, since processing increases the required
word length (e.g., during the sequential multiplication and subtraction of
many numbers) at intermediate steps in the analysis.

Data Windows


The  use  of  digital  processors  on  finite-length  data  intervals  results 
in  errors  caused  by  the  sudden  appearance  and  disappearance  of  the  data. 
These  errors  are  greatest  for  short  intervals  and  become  negligible  for 
very  large  data  sets.  The  errors  can  be  reduced,  however,  by  weighting 
the  data  in  a  manner  that  simulates  the  gradual  turning  on  and  off  of 
the  information.  Such  weighting  functions  are  called  windows,  since  they 
represent  the  finite  boundaries  through  which  the  computer  views  the 
(presumably)  infinite  data  stream. 

For an input f(n), n = 0, ..., N-1, the window function a(n) is used to
create a processing input b(n):

$$ b(n) = f(n)\,a(n), \qquad n = 0, \ldots, N-1 $$

For long data sets, a(n) may be unity, forming a rectangular window:

$$ a(n) = 1, \qquad n = 0, \ldots, N-1 $$


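The simplest alternative to the rectangular window is the triangular,
or Bartlett, window:

$$ a(n) = \begin{cases} \dfrac{2n}{N-1}, & 0 \le n \le \dfrac{N-1}{2} \\[4pt] 2 - \dfrac{2n}{N-1}, & \dfrac{N-1}{2} < n \le N-1 \end{cases} $$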

A slightly better type of window applies a cosine weighting to the data.
The two most popular are the Hanning window

$$ a(n) = \tfrac{1}{2}\left[1 - \cos\!\left(\frac{2\pi n}{N-1}\right)\right], \qquad 0 \le n \le N-1 $$

and the similar Hamming window

$$ a(n) = 0.54 - 0.46 \cos\!\left(\frac{2\pi n}{N-1}\right), \qquad 0 \le n \le N-1 $$

The Blackman window provides even better performance, when needed, at the
cost of a second cosine term:

$$ a(n) = 0.42 - 0.5 \cos\!\left(\frac{2\pi n}{N-1}\right) + 0.08 \cos\!\left(\frac{4\pi n}{N-1}\right), \qquad 0 \le n \le N-1 $$

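and finally, Kaiser (54) suggests a family of windows using the zeroth-order
modified Bessel function of the first kind, I_0( ):

$$ a(n) = \frac{I_0\!\left[\omega_a \sqrt{\left(\frac{N-1}{2}\right)^2 - \left(n - \frac{N-1}{2}\right)^2}\,\right]}{I_0\!\left[\omega_a \frac{N-1}{2}\right]}, \qquad 0 \le n \le N-1 $$

where $\omega_a$ is an adjustment parameter, usually in the range

$$ 4 < \omega_a \frac{N-1}{2} < 9 $$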

Window choice is usually a matter of convenience if the data frame
size  is  large  enough.  For  very  small  data  blocks,  however,  more  care 
must  be  taken  in  window  design.  Windows  may  be  examined  and  compared  in 
the  frequency  domain,  of  course,  where  some  of  their  features  are  most 
easily understood. In particular, when using windows to smooth
periodograms for spectral estimation, some windows can produce negative power
estimates  (for  some  frequencies)  because  of  the  negative  spectrum  of 
portions  of  the  window,  as  discussed  in  Tretter  (109). 
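
The windows named above can be compared in the frequency domain with a few lines of code. This is a sketch under assumptions: the window length N = 64, the zero-padding factor, and the helper `peak_sidelobe_db` are illustrative choices, and the Kaiser parameter `beta` is taken here to play the role of the adjustment parameter times (N-1)/2. The definitions follow the standard symmetric forms of these windows.

```python
import numpy as np

def bartlett(N):
    n = np.arange(N)
    return 1.0 - np.abs(2.0 * n / (N - 1) - 1.0)

def hanning(N):
    n = np.arange(N)
    return 0.5 * (1.0 - np.cos(2.0 * np.pi * n / (N - 1)))

def hamming(N):
    n = np.arange(N)
    return 0.54 - 0.46 * np.cos(2.0 * np.pi * n / (N - 1))

def blackman(N):
    n = np.arange(N)
    return (0.42 - 0.5 * np.cos(2.0 * np.pi * n / (N - 1))
            + 0.08 * np.cos(4.0 * np.pi * n / (N - 1)))

def kaiser(N, beta):
    n = np.arange(N)
    half = (N - 1) / 2.0
    arg = np.clip(1.0 - ((n - half) / half) ** 2, 0.0, None)
    return np.i0(beta * np.sqrt(arg)) / np.i0(beta)

def peak_sidelobe_db(w, pad=64):
    """Highest sidelobe of the window spectrum, in dB relative to the peak."""
    spectrum = np.abs(np.fft.rfft(w, pad * len(w)))
    level = 20.0 * np.log10(np.maximum(spectrum / spectrum.max(), 1e-12))
    i = 0
    while i < len(level) - 1 and level[i + 1] < level[i]:
        i += 1                      # walk down the main lobe to its first null
    return level[i:].max()

N = 64
sidelobes = {
    "rectangular": peak_sidelobe_db(np.ones(N)),
    "hanning": peak_sidelobe_db(hanning(N)),
    "hamming": peak_sidelobe_db(hamming(N)),
    "blackman": peak_sidelobe_db(blackman(N)),
}
```

Lower sidelobes come at the price of a wider main lobe, which is the trade-off behind the remark that window choice matters most for very small data blocks.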


FAST  FOURIER  TRANSFORM  (FFT) 

The Fast Fourier Transform (FFT) is one of the most popular
signal-processing techniques. The method gets its name from one of several
algorithms available for quickly and efficiently obtaining the discrete
Fourier Transform of a given time series. The discrete transform is
useful for power-spectrum estimation, signal characterization, signal
detection, and analysis of model performance. For further reading, the
books by Oppenheim and Schafer (76) and Tretter (109) are well written
and informative.


Definition  of  Fourier  Transform 

Given a time series f(t), and a sampled version of the signal f(nT),
the Fourier Transform (of the sampled signal) may be written as

$$ F(e^{j\omega}) = \sum_{n=-\infty}^{\infty} f(nT)\, e^{-j\omega nT} $$

where

$$ f(nT) = \frac{T}{2\pi} \int_{-\pi/T}^{\pi/T} F(e^{j\omega})\, e^{j\omega nT}\, d\omega $$


Discrete  Fourier  Series 


For periodic signals of period N or limited sample signals x(n),
n = 0, ..., N-1, that may be assumed to repeat after N steps, we may write

$$ x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, e^{j(2\pi/N)nk} $$

$$ X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j(2\pi/N)nk} $$

or, with $W_N = e^{-j(2\pi/N)}$,

$$ X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{kn} $$

$$ x(n) = \frac{1}{N} \sum_{k=0}^{N-1} X(k)\, W_N^{-kn} $$

X(k)  may  be  thought  of  as  samples  on  the  unit  circle,  equally  spaced 
in  angle,  of  the  z-transform  of  (one  period  of)  x(n). 

If  the  input  is  thought  of  as  periodic,  then  the  operations 
above  are  usually  called  the  discrete  Fourier  series;  if  the  input  is 
considered  finite  duration,  then  the  operations  are  called  the  Discrete 
Fourier  Transform  (DFT).  The  transform  (with  no  computation  errors)  is  an 
exact  representation  (one-to-one  mapping)  of  the  input,  but  may  not  be  a 
good approximation of the continuous transform due to sampling and spectral
estimation  considerations. 

Efficient  algorithms  for  computing  DFTs  are  known  as  Fast  Fourier 
Transforms  (FFTs).  These  algorithms  have  greatly  expanded  the  use  of 
Fourier  transforms  in  signal  processing,  and  permit  the  computation  of 
parameters  that  were  considered  impractical  a  short  while  ago. 
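
The relation between the DFT sums and an FFT routine can be checked directly. The sketch below evaluates the defining sums as a matrix product, which costs on the order of N^2 operations, and compares the result with NumPy's FFT; the function names `dft_direct` and `idft_direct` and the test signal are illustrative assumptions.

```python
import numpy as np

def dft_direct(x):
    """DFT by direct evaluation of X(k) = sum_n x(n) W_N^{kn}."""
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # matrix of W_N^{kn}
    return W @ x

def idft_direct(X):
    """Inverse DFT: x(n) = (1/N) sum_k X(k) W_N^{-kn}."""
    N = len(X)
    n = np.arange(N)
    W_inv = np.exp(2j * np.pi * np.outer(n, n) / N)
    return (W_inv @ X) / N

rng = np.random.default_rng(1)
x = rng.standard_normal(64)

X_direct = dft_direct(x)
X_fft = np.fft.fft(x)            # same transform, about N log N operations
x_back = idft_direct(X_direct)   # one-to-one mapping: recovers the input
```

The two transforms agree to rounding error, and the inverse recovers the input, which illustrates the statement that the DFT is an exact representation of the finite data record.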


SPECTRAL  ESTIMATION  WITH  FFT  ALGORITHMS 

In  many  instances,  one  wants  to  estimate  the  power-density  spectrum 
of  the  process  which  generated  the  data  sample.  The  Fourier  Transform 
may  be  used  in  several  ways  to  obtain  such  an  estimate.  The  most  direct 
estimate  is  the  periodogram,  which  is  the  square  of  the  amplitude  of  the 
DFT,  i.e., 


$$ I_N(\omega) = \frac{1}{N} \left| X(e^{j\omega}) \right|^2 $$

Unfortunately,  this  estimate  is  not  usually  a  good  use  of  the  available 
N  data  points.  The  periodogram  is  a  biased  estimate  of  the  spectrum,  with 
a  bias  which  decreases  with  N  but  a  variance  which  approaches  a  constant 
for large N. For a Gaussian process, the variance of the periodogram
approaches  the  square  of  the  spectrum  which  results  in  rapid  fluctuations 
(in  the  periodogram  from  one  frequency  point  to  the  next)  about  the  true 
spectrum. Since the resolution in frequency also increases with N, there
is an inevitable trade-off between resolution and variance (see footnote 3).


3 The resolution of an FFT calculation is equal to the bandwidth of interest
divided by half the number of data samples used (since two data points are
used to produce amplitude and phase information at each frequency).

Two common ways of improving the periodogram estimates are averaging
and smoothing. The averaging method breaks the data block into several
smaller blocks, usually not overlapping, and computes a periodogram for
each smaller block. These periodograms are then averaged, at each
frequency, to obtain a spectrum estimate. This estimate has the low
resolution and large bias of the short block size, with a reduced variance
from the averaging (see footnote 4). The estimate is affected by any data
window used, of course, and other effects dominate for very short data blocks.
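
The variance reduction obtained by averaging can be seen in a small experiment. The sketch assumes unit-variance white Gaussian noise, whose true spectrum is flat and equal to 1 with the normalization used here; the record length, the number of blocks, and the function names are illustrative choices, and no data window is applied.

```python
import numpy as np

def periodogram(x):
    """Periodogram: squared DFT magnitude divided by the record length."""
    return np.abs(np.fft.rfft(x)) ** 2 / len(x)

def averaged_periodogram(x, n_blocks):
    """Average the periodograms of n_blocks non-overlapping blocks."""
    blocks = x[: (len(x) // n_blocks) * n_blocks].reshape(n_blocks, -1)
    return np.mean([periodogram(b) for b in blocks], axis=0)

rng = np.random.default_rng(2)
x = rng.standard_normal(8192)               # white noise, true spectrum = 1

raw = periodogram(x)[1:-1]                  # drop the DC and Nyquist bins
avg = averaged_periodogram(x, 16)[1:-1]

print("raw periodogram:      mean %.3f  variance %.3f" % (raw.mean(), raw.var()))
print("16-block average:     mean %.3f  variance %.3f" % (avg.mean(), avg.var()))
```

The raw estimate scatters about the true value with a variance close to the square of the spectrum however long the record is, while the averaged estimate has roughly one sixteenth of that variance at one sixteenth of the frequency resolution.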

An alternative technique for improving spectral estimates is to "smooth"
the large-N periodogram. The smoothers used correspond to window types,
and the remaining errors relate to window shape. The same types of
trade-offs exist as in averaging, and the smoothing method is not used as
much as averaging now that FFTs permit fast periodogram calculations.

These techniques assume, of course, that the desired spectrum does
not change during the data interval (N steps, above). This assumption of
stationarity is implicit in most Fourier Transform analysis. The DFT
will always be a unique transform of a given data set, but it may or may
not be the spectrum of interest. Several complex tests for stationarity
exist and should be used if there is doubt about the signal
characteristics. A change in the measured windowed spectrum gives an indication
of nonstationarity, of course, but, in general, FFTs provide a poor means
for measuring changing spectra.


SPECTRAL  ESTIMATION  VIA  ENTROPY  AND  LIKELIHOOD 

The Maximum Entropy Method and the Maximum Likelihood Method (MEM and MLM)
for spectral estimation are modern procedures (Childers (24)) often
considered superior to various alternatives. The value of these methods
lies in providing tools which, at least in principle, do not impose strong
assumptions about the underlying structure of the process being examined.
This is of importance since other parametric tools of statistical analysis
require checks of their appropriateness, of which practitioners are
often not aware. Much of the success of MEM and MLM is due to their
minimal structural assumptions, so that the properties of the data come
to bear, rather than those of a possibly incorrectly implemented procedure.
There are still open questions in this area, but the concepts are appealing
and turn out to be related to certain forms of statistical modeling,
showing the latter in a still different light.


4 A bias indicates that the estimated spectrum converges to the wrong
spectrum, independent of the number of periodograms averaged. The bias
is a function of the number of data points and window function used in
each periodogram. The degree of convergence to the (biased) spectrum
estimate is indicated by the variance, which decreases with the number
of periodograms in the average.


The  Maximum  Entropy  Method  (MEM) 


The  Maximum  Entropy  Method  is  based  on  a  result  by  Bartlett  (9), 
who  showed  a  simple  relation  between  the  power-transfer  characteristics 
of  a  linear  filter  and  the  change  of  entropy  of  the  signal  transferred. 
Burg (17) then raised the question of which spectrum estimate has the
largest entropy, given autocorrelation values of a signal. The idea of
maximizing entropy is appealing since it is "least committal" (Ables (2)),
that is, few prior assumptions have to be made about the data. The question
leads to a constrained optimization problem. The entropy change is
given by Bartlett's formula

$$ \Delta H = \int_F \ln S(\nu)\, d\nu $$

and the constraints resulting from the (usually estimated) autocorrelation
values satisfy

$$ g(k) = \int_F S(\nu) \exp[-i 2\pi \nu k]\, d\nu $$

for each lag k at which an autocorrelation value is available.

Interestingly,  the  Lagrangian  multipliers  in  this  problem  can  be  inter¬ 
preted  as  autoregressive  coefficients  (Childers  (23,  p.  92)),  identical 
to  those  described  in  Box  and  Jenkins  (14).  Burg  (17)  suggests  certain 
procedures  to  estimate  these  coefficients  in  a  recursive  way,  without 
resorting  to  estimates  of  the  autocorrelation  of  the  process,  as  is  done 
by the use of the Yule-Walker (120) equations. Thus, the importance of
"end-effects" of the finite data window is reduced. For example, by use of
the Yule-Walker equations, there is a finite probability of obtaining (for
autoregressive models of higher than second order) a (Toeplitz)
autocorrelation matrix that is not positive semidefinite (regarding the
definition of the autocorrelation matrix, see Box and Jenkins (14, p. 31)).
Hence, autoregressive coefficients (or Lagrangian multipliers) may be
estimated which correspond to an explosive process. Clearly, spectra which correspond
to  such  a  process  are  meaningless,  since  limits  by  which  spectra  are 
defined  do  not  exist  under  such  circumstances.  The  problem  is  exacerbated 
when  observations  are  missing. 
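The recursive estimation described above can be illustrated with a minimal sketch of a Burg-type recursion. This is our own illustration in Python with NumPy, not the routine of Burg (17); the function names `burg_ar` and `ar_spectrum` are hypothetical. The sign convention is that of a prediction-error polynomial, so the autoregressive coefficients of the text correspond to the negatives of `a[1:]`.

```python
import numpy as np

def burg_ar(x, order):
    """Fit an AR model directly from the data by a Burg-type recursion.

    Returns (a, err): a = [1, a_1, ..., a_M] is the prediction-error
    polynomial (x[t] + a_1 x[t-1] + ... + a_M x[t-M] = e[t]) and err is
    the estimated power of the driving noise e[t].
    """
    x = np.asarray(x, dtype=float)
    f = x[1:].copy()     # forward prediction errors
    b = x[:-1].copy()    # backward prediction errors, delayed one step
    a = np.array([1.0])
    err = np.mean(x ** 2)
    for _ in range(order):
        # Reflection coefficient minimizing forward plus backward error power;
        # its magnitude never exceeds 1, so the fitted model is not explosive.
        k = -2.0 * np.dot(f, b) / (np.dot(f, f) + np.dot(b, b))
        a = np.append(a, 0.0)
        a = a + k * a[::-1]          # Levinson-type coefficient update
        err *= 1.0 - k * k
        f, b = (f + k * b)[1:], (b + k * f)[:-1]
    return a, err

def ar_spectrum(a, err, freqs):
    """AR (maximum entropy) spectrum at frequencies given in cycles/sample."""
    k = np.arange(len(a))
    denom = np.abs(np.exp(-2j * np.pi * np.outer(freqs, k)) @ a) ** 2
    return err / denom
```

No autocorrelation estimate is formed at any step; the reflection coefficients are computed from the forward and backward prediction errors of the data themselves, which is the property the text credits with reducing end-effects.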

Various extensions of Burg's (18) MEM are in use today. These
extensions concern, for example, multidimensional image processes,
processes containing white noise, and vector processes. The remaining
main problem for practical applications results from questions about the
order M of the autoregressive model (or more generally, the number of
Lagrangian multipliers) which should be used. This question is not
answered by the current maximum entropy methodology. Some guidance is
derived from classical statistical procedures or criteria such as those
given by Akaike (5) or Schwarz (98).
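As a rough illustration of such criteria, the sketch below compares candidate orders using penalized residual variance. The two penalty forms used (2M and M ln N) are the ones commonly associated with Akaike and Schwarz; whether they match the exact criteria of references (5) and (98) is an assumption on our part, and the least-squares fit and the function names are hypothetical stand-ins.

```python
import numpy as np

def ar_residual_variance(x, p):
    """Residual variance of a least-squares AR(p) fit (p = 0 means no model)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if p == 0:
        return np.mean(x ** 2)
    # Column k holds x[t-k-1] for t = p, ..., n-1.
    X = np.column_stack([x[p - k - 1:n - k - 1] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coef) ** 2)

def select_order(x, max_order):
    """Return the orders minimizing an AIC-type and a BIC-type criterion."""
    n = len(x)
    orders = np.arange(max_order + 1)
    s2 = np.array([ar_residual_variance(x, p) for p in orders])
    aic = n * np.log(s2) + 2 * orders
    bic = n * np.log(s2) + orders * np.log(n)
    return int(orders[np.argmin(aic)]), int(orders[np.argmin(bic)])
```

The heavier penalty of the second criterion never selects a higher order than the first for records longer than a few samples, which is why the two are often reported together.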

The  importance  of  the  MEM  approach  lies  in  avoiding  the  imposition 
of  any  particular  structural  assumptions  on  the  spectral  estimation 
(except  for  linearity  when  Bartlett's  formula  is  used).  In  some  instances, 
however, certain assumptions about a spectrum are reasonable. One might
expect superior estimation if these prior assumptions are incorporated
into spectral estimation. This idea leads to the method of maximum
likelihood spectral estimation.


The  Maximum  Likelihood  Method  (MLM) 

The maximum likelihood method is derived from the structure shown in
Figure C-1. The goal is to adjust the filter coefficients in such a way that
the single frequency of the signal z(t) is optimally (without bias) estimated
by the filter output x(t). Thus the filter is to be adjusted in such a way
that the signal z(t) is transferred without distortion, and all other
frequencies are suppressed as much as possible (Lacoss (58)). Obviously,
the procedure has certain optimality properties when a single frequency
is to be estimated. However, for the practitioner it is interesting to
see how the scheme performs when some assumptions are not satisfied,
for example, when the spectrum contains two frequencies, possibly "close"
together. It turns out that in such a situation the "noncommittal"
MEM is superior to MLM in detecting two spectral lines. Thus, as one
might expect, MLM should only be used when there is
strong prior evidence for the existence of only a single frequency in an
otherwise continuous spectrum. Burg (19) noted a simple relation between
MEM and MLM spectral estimation which accounts for some of the
properties of MLM estimation, for example, the "smeared out" estimation of
a pair of spectral lines.
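The filter described above (unit gain at the frequency of interest, minimum output power otherwise) leads to a spectral estimate that is commonly written as the reciprocal of a quadratic form in the inverse autocorrelation matrix. The sketch below assumes that form and a biased sample autocorrelation; the function name `mlm_spectrum` and the filter length `m` are hypothetical, and the scaling is left unnormalized.

```python
import numpy as np

def mlm_spectrum(x, m, freqs):
    """Minimum-variance (MLM-type) spectral estimate.

    x     : data record
    m     : filter length (size of the autocorrelation matrix)
    freqs : frequencies in cycles/sample
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates for lags 0..m-1.
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(m)])
    idx = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))
    R_inv = np.linalg.inv(r[idx])                 # inverse Toeplitz matrix
    # One steering vector per row: [1, e^{i 2 pi f}, ..., e^{i 2 pi f (m-1)}].
    E = np.exp(2j * np.pi * np.outer(freqs, np.arange(m)))
    quad = np.einsum('fi,ij,fj->f', E.conj(), R_inv, E).real
    return 1.0 / quad
```

For a single sinusoid in white noise the estimate peaks at the sinusoid's frequency; two closely spaced lines tend to merge as the filter length is reduced, in keeping with the "smeared out" behavior noted above.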

When two or more spectral lines are expected in an otherwise continuous
spectrum, one might of course generalize the MLM approach. The
value  of  such  generalizations  can  be  seen  in  Siegel's  (102)  generalized 
test  of  periodicities  in  a  white  spectrum.  Even  though  Fisher's  test  is 
known  to  be  optimal  in  certain  settings  for  the  detection  of  a  single 
frequency,  it  is  outperformed  by  Siegel's  method  when  multiple  lines  are 
present  in  a  spectrum.  At  the  same  time  very  little  power  for  detecting 
single  lines,  compared  to  Fisher's  optimal  test,  is  lost. 


Figure C-1. Structure for maximum likelihood spectral estimation (MLM).
APPENDIX D


COMMUNICATION-THEORETIC  METHODS 


MODULATION 

Several concepts in communication theory may prove to be useful in
understanding or characterizing EEG signals. In particular, modulation
and heterodyning may be used to transform an EEG analysis task into a
domain where linear signal processing is more appropriate. Modulation,
in general, is the encoding of a signal of interest in another, more
easily transmitted signal. The process is designed to be reversible
so that the original signal may be retrieved (demodulated) by the receiver.
An excellent reference for communication systems is Wozencraft and
Jacobs (119).

The most common forms of modulation--amplitude and frequency modulation
(AM and FM)--are achieved by the multiplication of a sinusoidal
(carrier) signal s(t) by an information process of interest a(t). The
new signal (usually thought of as a "transmitted" signal) z(t) is then

$$ z(t) = a(t) \, s(t) $$

or a linearly filtered version of the above.

Amplitude modulation is produced by the operation

$$ z(t) = A(t) \sin \omega t $$

where A(t) is the process of interest and $\omega$ is known, while frequency
modulation (or, more generally, angle modulation) may be written

$$ z(t) = A \sin(\omega t + \theta(t)) $$

where $\theta(t)$ contains information and A is a fixed amplitude.

Although we are not interested in communication systems in general,
it is useful to examine typical transmitted signal types and the means of
demodulating them. Demodulation is a transformation to recover a signal
of interest from the transmitted wave. In many instances of signal processing,
it is useful to transform a signal (whether originally modulated or
not) to simplify subsequent processing. In addition, demodulation may be
used to extract certain information (e.g., phase coherency) of interest
even though the corresponding modulation process is not thought to be
present.

The basic goal of most demodulation systems is to mathematically
recover the signal (i.e., functionally invert the modulation) while
removing as much transmission noise as possible. The noise may be wideband
(nearly white) due to receiver thermal noise or very narrowband
due to specific interference or jamming. For EEG analysis, noise sources
include thermal noise from the electrodes and amplifiers, 60 Hz "line"
noise from the power system, and possibly a related 120 Hz noise from
fluorescent lights. For VER detection, the spectrally similar background
EEG itself may be considered "noise."


Heterodyning 

Heterodyning is a modulation process which frequency-shifts the
signal of interest to facilitate transmission or processing. The technique
relies on standard trigonometric identities and high-, low-, or bandpass
filtering to manipulate the signal spectrum in the frequency domain.

As an example, consider a signal of bandwidth W, as shown below:

[Figure: two-sided power spectrum S(f), occupying the band from -W to W.]

where S(f) is the 2-sided power spectrum.

If this signal is multiplied by $2 \cos \omega_1 t$, the new product has spectrum:

[Figure: spectrum S'(f), with copies of S(f) centered on $-f_1$ and $f_1$; the upper copy extends from $f_1 - W$ to $f_1 + W$.]

where $f_1 = \omega_1 / 2\pi$. The information in the range $f_1 - W$ to $f_1$ is redundant and
sometimes filtered out (in single side-band modulation) to produce the
spectrum:


[Figure: single side-band spectrum $S_{SSB}(f)$, occupying the band from $f_1$ to $f_1 + W$.]


If this signal is transmitted, received, and multiplied by $2 \cos \omega_1 t$ again,
the spectrum below results:


[Figure: spectrum S''(f), with a baseband component extending to W and a second component from $2f_1$ to $2f_1 + W$.]

This signal may now be low-pass filtered to remove the component above
$2f_1$ and recover the original information.

This type of manipulation, while invaluable in communications, must
be approached with caution in signal analysis in general, where no control
may exist over the original modulation. For example, if a narrowband
signal exists as shown:

[Figure: spectrum of the narrowband signal.]

and the signal is multiplied by $2 \cos \omega_2 t$ where

$$ f_2 = \omega_2 / 2\pi = f_1 - \Delta, \qquad \Delta < W $$

and then low-pass filtered to remove the $f_1 + f_1 - \Delta$ component, the distorted
spectrum shown below results:


When "demodulating" a narrowband signal whose modulation is uncertain,
one must be careful to preserve the information in the signal.
One way to assure this preservation is to carefully bandpass filter the
signals before heterodyning, to assure that no unanticipated aliasing, or
frequency ambiguity (as shown above), occurs.

One way to use such distortion to advantage is in power spectrum
estimation. For example, if we multiply a measured EEG by a cosine wave
at 10 Hz and pass the result through a low-pass filter with a bandwidth
of 1 Hz, the resulting signal is a mixture of the original signal between
9 and 11 Hz. By rectifying and averaging (low-pass filtering) this
signal, an estimate of the power in the original signal, between 9 and
11 Hz, may be obtained. This estimator has independent control over the
bandwidth (the first low-pass filter after heterodyning) and response
(the averager low-pass filter) of the spectral estimate, subject to the
restriction that the averager should not be faster (higher bandwidth)
than the first filter.
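The estimator just described can be sketched in a few lines. The following is an illustration under stated assumptions rather than a recommended design: the low-pass filter is a crude FFT brick-wall filter, the rectifier is a square-law device, the averager is a plain mean over the whole record, and the names `lowpass` and `band_power` are hypothetical.

```python
import numpy as np

def lowpass(x, cutoff_hz, fs):
    """Crude brick-wall low-pass filter: zero all FFT bins above cutoff_hz."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    X[f > cutoff_hz] = 0.0
    return np.fft.irfft(X, n=len(x))

def band_power(x, f0_hz, bandwidth_hz, fs):
    """Estimate the power of x within bandwidth_hz on either side of f0_hz."""
    t = np.arange(len(x)) / fs
    mixed = x * 2.0 * np.cos(2.0 * np.pi * f0_hz * t)   # heterodyne to baseband
    base = lowpass(mixed, bandwidth_hz, fs)             # sets the bandwidth
    return np.mean(base ** 2)                           # rectify and average
```

With `f0_hz = 10` and `bandwidth_hz = 1`, a sinusoid of amplitude 2 at 10.3 Hz contributes a power of about 2 (amplitude squared over two), while a component at 20 Hz is rejected. A component exactly at `f0_hz` would give a phase-dependent result with a cosine reference alone; a second (sine) channel removes that dependence.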


PHASE-LOCK  LOOPS 

Phase-lock loops are communications receivers that lock on, or
synchronize, to the transmitted signal, thus permitting excellent performance
even during transmitter or receiver drift in frequency. These
devices naturally demodulate phase- or frequency-modulated signals, while
being almost completely insensitive to amplitude modulation--which may
be due to interference in normal communication channels.

Usually, phase-lock loops assume a signal input of the form:

$$ z(t) = A \sin(\omega t + \theta(t)) $$

where A and $\omega$ are known and $\theta(t)$ is the information process of interest.

If z(t) is multiplied by a signal of the form

$$ 2 \cos(\omega t + \hat{\theta}(t)) $$

and the product is then low-pass filtered to remove the double-frequency
component $(A \sin(2\omega t + \theta + \hat{\theta}))$ [footnote 5], the result is

$$ A \sin(\theta - \hat{\theta}) $$

If $\hat{\theta}$ is "close" to $\theta$, this signal may be approximated by

$$ A(\theta - \hat{\theta}) $$

and a linear filter then used (in servo fashion) to compute $\hat{\theta}$, an estimate
of $\theta$, from the error signal $\theta - \hat{\theta}$. The loop is (usually) completed by a
voltage-controlled oscillator, which takes $\hat{\theta}$ as an input and produces
$2 \cos(\omega t + \hat{\theta})$ as an output. The loop thus looks like Figure D-1, and
performs like the linear system of Figure D-2.
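The error signal quoted above follows from a standard product-to-sum identity. Writing $\hat{\theta}$ for the loop's phase estimate and suppressing time dependence, the multiplier output is

$$ 2 A \sin(\omega t + \theta) \cos(\omega t + \hat{\theta}) = A \sin(2\omega t + \theta + \hat{\theta}) + A \sin(\theta - \hat{\theta}) $$

The first term on the right is the double-frequency component removed by the low-pass filter; the second is the slowly varying error term, for which $\sin(\theta - \hat{\theta}) \approx \theta - \hat{\theta}$ when the phase error is small.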


Figure  D-2.  Linearized  PLL  model. 

5 We temporarily suppress the time dependence of $\theta$ and $\hat{\theta}$ for convenience.

6 LP by the multiplier refers to the low-pass filtering which removes the
double-frequency terms.
Several assumptions are implicit in the PLL model. The frequency
and amplitude are assumed known, but this may be relaxed somewhat. For
example, the filter may be designed to acquire and lock to a frequency
different from that expected (within limits, of course). Also, practical
considerations usually require that a hard limiter be used on the input
(usually in the multiplier). Thus, the received amplitude may be an
unknown A but the input amplitude to the PLL will be the known limiter
amplitude (L), provided that

A > L

If A < L and A is an unknown function of time, then A can significantly
degrade loop performance or make the loop completely lose lock. The
degree to which the original signal can be amplified to make A > L is
limited by the receiver noise.

Phase-lock loops have been suggested for use with EEGs to track an
expected phase modulation. This model assumes that the single-sideband
type of phase modulation (shown above) is in fact present, and that
double-sided modulation, i.e.,

$$ z(t) = A \sin(\omega t + \theta) + A \sin(\omega t - \theta) $$

is not. It also requires that A be known or limited (as above). It is
generally  prudent  to  narrowband  filter  the  input  before  the  PLL,  to  assure 
that  no  out-of-band  components  corrupt  the  processing.  The  narrowband 
filter  should  have  at  least  the  bandwidth  of  the  loop  filter  (calculated 
from  the  linear  model)  and  be  centered  around  the  band  of  interest. 

These bandwidth considerations become very important when the PLL
filter bandwidth is a large fraction of the carrier frequency, for example,
for low frequency EEGs. For instance, if a PLL is tracking an EEG
component at 8 Hz, and the loop filter is 1 Hz wide, then the loop will
be susceptible to other EEG components present between 7 and 9 Hz. Unless
the EEG is known to represent only one phase-modulated process, it
is dangerous to use a wideband PLL to track the "EEG phase."

One final aspect of PLLs is the inherent lock indication available.
By multiplying the original limited input by $2 \sin(\omega t + \hat{\theta})$ and
low-pass filtering, the signal $L \cos(\theta - \hat{\theta})$ is obtained. An averaged value of
this signal provides an indication of lock since, for $\theta - \hat{\theta}$ small, the
cosine approaches 1. A threshold detector easily may be set to trigger on this
signal, often called the "quadrature" channel since it results from a
signal 90° out-of-phase to the original feedback channel.
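A discrete-time simulation makes the loop and its lock indicator concrete. The sketch below is a deliberately simple illustration, assuming a known carrier frequency, no limiter, a one-pole low-pass filter after each multiplier, and a pure integrator as the loop filter; the function name `pll_track` and the parameters `loop_gain` and `lp_cutoff_hz` are hypothetical choices, not values taken from the literature cited.

```python
import numpy as np

def pll_track(z, f_carrier, fs, loop_gain=5.0, lp_cutoff_hz=5.0):
    """Track the phase of z[n] = A sin(2 pi f_carrier n / fs + theta).

    Returns (theta_hat, lock): the phase estimate and the low-pass
    filtered quadrature channel, which is near A when the loop is locked.
    """
    dt = 1.0 / fs
    alpha = 1.0 - np.exp(-2.0 * np.pi * lp_cutoff_hz * dt)  # one-pole LP gain
    theta_hat, err_lp, lock_lp = 0.0, 0.0, 0.0
    theta_track = np.empty(len(z))
    lock_track = np.empty(len(z))
    for n, zn in enumerate(z):
        phase = 2.0 * np.pi * f_carrier * n * dt + theta_hat
        # Feedback channel: low-pass of z * 2cos(.) is about A sin(theta - theta_hat).
        err_lp += alpha * (zn * 2.0 * np.cos(phase) - err_lp)
        # Quadrature channel: low-pass of z * 2sin(.) is about A cos(theta - theta_hat).
        lock_lp += alpha * (zn * 2.0 * np.sin(phase) - lock_lp)
        # Loop filter (integrator) drives the oscillator phase.
        theta_hat += loop_gain * err_lp * dt
        theta_track[n] = theta_hat
        lock_track[n] = lock_lp
    return theta_track, lock_track
```

For a unit-amplitude 10 Hz input with a constant phase offset, the estimate settles near the offset within about a second and the quadrature channel averages close to 1. Raising `loop_gain` widens the loop bandwidth and with it the susceptibility to neighboring components discussed above.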

In general, if $\theta(t)$ can be described as a linear process and if the
actual measurement

$$ z(t) = A \sin(\omega t + \theta(t)) + n(t) $$

has low enough noise n(t), then a phase-lock loop will perform very well.
For cases where more noise is present, other filters may be developed
(see, e.g., Bucy and Mallinckrodt (16), Willsky (117), Gustafson and
Speyer (44)). For the EEG, measurement noise is not as much of a problem
as amplitude modulation, and two simple approaches to phase and
amplitude demodulation are worth noting.
The first technique uses linear filters in the measurement space,
i.e., filters which take

$$ x(t) = A(t) \sin(\omega t + \theta) $$

or the "base band" (heterodyned by $2 \sin \omega t$ and $2 \cos \omega t$ and low-pass
filtered) pair

$$ x_1(t) = A(t) \sin \theta $$
$$ x_2(t) = A(t) \cos \theta $$

as signals of interest (states). These filters assume that the measurement
is of the form

$$ z(t) = x(t) + n(t) $$

where n(t) is white Gaussian noise, and then

$$ z_1(t) = x_1(t) + n_1(t) $$
$$ z_2(t) = x_2(t) + n_2(t) $$

are the inputs fed to linear filters, where $n_1$ and $n_2$ become (after
heterodyning) independent white Gaussian noise processes. The linear filters
produce estimates $\hat{x}_1(t)$ and $\hat{x}_2(t)$, and phase and amplitude estimates are
constructed by

$$ \hat{\theta} = \tan^{-1} [\hat{x}_1(t) / \hat{x}_2(t)] $$
$$ \hat{A}(t) = (\hat{x}_1^2(t) + \hat{x}_2^2(t))^{1/2} $$
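The base-band construction can be sketched as follows. In this illustration a crude FFT brick-wall low-pass filter stands in for the linear filters of the text (an assumption; an optimal filter would depend on the signal and noise models), and the names `lowpass` and `iq_demodulate` are hypothetical. The four-quadrant arctangent is used so that the phase estimate is not restricted to half the circle.

```python
import numpy as np

def lowpass(x, cutoff_hz, fs):
    """Crude brick-wall low-pass filter: zero all FFT bins above cutoff_hz."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), 1.0 / fs)
    X[f > cutoff_hz] = 0.0
    return np.fft.irfft(X, n=len(x))

def iq_demodulate(z, f_carrier, fs, cutoff_hz):
    """Amplitude and phase estimates from the heterodyned base-band pair."""
    t = np.arange(len(z)) / fs
    w = 2.0 * np.pi * f_carrier
    x1 = lowpass(z * 2.0 * np.cos(w * t), cutoff_hz, fs)   # about A sin(theta)
    x2 = lowpass(z * 2.0 * np.sin(w * t), cutoff_hz, fs)   # about A cos(theta)
    theta_hat = np.arctan2(x1, x2)
    a_hat = np.hypot(x1, x2)
    return a_hat, theta_hat
```

Applied to `1.5 * sin(2*pi*10*t + 0.6)` sampled at 200 Hz with a 2 Hz cutoff, the estimates are constant at amplitude 1.5 and phase 0.6; with a slowly varying A(t) or phase, the cutoff sets how quickly the estimates can follow.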

An alternate technique for amplitude estimation is to use the quadrature
channel of a PLL discussed above. If a limiter is used for the
normal PLL, and A(t) > L, then the normal loop will be insensitive to
A(t), and a good phase estimate will be obtained. Also, the quadrature
signal

$$ 2 \sin(\omega t + \hat{\theta}(t)) $$

may be used to heterodyne the input (without limiting) to create

$$ A(t) \cos(\theta - \hat{\theta}) $$

If $\hat{\theta}$ is close to $\theta$, this signal produces a good amplitude estimate.

It should be noted that all of these demodulation techniques assume
an original modulation. If the amplitude or phase of an EEG is easier
to track, classify, or reproduce than the measured wave, then such
demodulation will be justified. If, however, the demodulation reveals no
new insight, it may be a needless complication, and other techniques should
be considered.
APPENDIX E


NONADAPTIVE  TIME-DOMAIN  ANALYSIS 

Time domain analysis is a tool to compress data efficiently, that
is, with little loss of information and if possible by a simple scheme.
In some sense the compression then describes the properties of the data.
The compressed information may be used to forecast, to classify patterns,
or to guide design changes (e.g., when a system appears "sluggish").

For the purpose of VER/EEG analysis we will concentrate on the
stochastic modeling of time processes. An important class of these models
is given by the Markov processes. In these processes the future statistics
of a process are fully specified from knowledge of the present statistics.
This concept is also referred to as a generalized causality principle.
In good part the importance of these processes arises from their flexibility
and mathematical convenience. The transition from the present to the
future may proceed in a linear or nonlinear fashion and the process may
or may not be stationary.

An important class of linear and stationary models is given by
autoregressive (AR), moving average (MA), and autoregressive-moving average
(ARMA) processes. Their importance in a variety of fields and a systematic
approach to selecting proper models are described by Box and Jenkins
(14). An important and still more general tool was developed by Kalman
(55). Its increased generality and applicability are due to considering
nonstationary linear processes, which also allow approximation of many
nonlinear processes through nonstationarity.

In this appendix we outline the model assumptions for AR, MA, and ARMA
structures together with their characteristics. Then we outline the Kalman
model and discuss its advantages over other procedures, as well as some
important problems related to its structural generality.


AUTOREGRESSIVE  AND  MOVING  AVERAGE  MODELS 

The  use  of  AR  models  goes  back  to  Yule  (120)  when  he  attempted  to 
predict  sunspot  activity.  It  had  been  observed  from  Wolfer's  (Box  and 
Jenkins  (14))  sunspot  data  that  the  sunspot  activity  was  nearly  periodic 
with  a  cycle  length  of  about  11  years.  However,  there  was  fluctuation  in 
amplitude  and  period  of  these  numbers.  Yule  (120)  attempted  to  describe 
these  fluctuations  by  a  causal  random  process  of  the  type: 

$$ x(t) = \alpha_1 x(t-1) + \alpha_2 x(t-2) + \ldots + \alpha_p x(t-p) + \epsilon(t) $$

where $\epsilon(t)$ expresses random shocks driving the linear process. Here
$\epsilon(t)$ is assumed to be a white process with constant power; that is, the
covariance of $\epsilon(k)$ and $\epsilon(\ell)$ is given by

$$ \mathrm{cov}[\epsilon(k), \epsilon(\ell)] = \delta_{k\ell} \sigma^2. $$

Yule found for such a process a closed form estimate for
$\alpha_1, \ldots, \alpha_p$ and $\sigma^2$ based on estimates of the autocorrelation function.
Asymptotically, as the observed data grows infinitely long, his estimate
is equivalent to the maximum likelihood estimate. Observe that
for an AR process any disturbance propagates infinitely long (even though
the amplitude may decrease).
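A closed-form estimate of this kind can be sketched from the sample autocorrelation. The code below solves the usual Yule-Walker linear system with a biased autocorrelation estimate; that choice of estimator and the function name `yule_walker` are our assumptions for illustration.

```python
import numpy as np

def yule_walker(x, p):
    """Estimate AR(p) coefficients alpha_1..alpha_p and the shock variance."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased sample autocovariances for lags 0..p.
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(p + 1)])
    idx = np.abs(np.subtract.outer(np.arange(p), np.arange(p)))
    R = r[idx]                              # p x p Toeplitz matrix
    alpha = np.linalg.solve(R, r[1:])       # Yule-Walker equations
    sigma2 = r[0] - np.dot(alpha, r[1:])    # variance of the driving shocks
    return alpha, sigma2
```

For a long record simulated from x(t) = 0.75 x(t-1) - 0.5 x(t-2) + e(t) with unit-variance shocks, the estimates come out close to (0.75, -0.5) and 1; for short records the estimates depend on how the ends of the record enter the autocorrelation estimate.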

An alternative linear process with slightly different characteristics
can be written as

$$ y(t) = \epsilon(t) - \beta_1 \epsilon(t-1) - \ldots - \beta_q \epsilon(t-q) $$
$$ \mathrm{cov}[\epsilon(k), \epsilon(\ell)] = \delta_{k\ell} \sigma^2 $$

This linear process is known as a moving average (MA) process. An apparent
difference from the AR process is its finite memory of lag q; that is, after
more than q steps any disturbance has died out.

For purposes of modeling it appears attractive to combine the AR
with the MA structure to obtain a still more flexible model. The combination
may be written as

$$ y(t) - \alpha_1 y(t-1) - \ldots - \alpha_p y(t-p) = \epsilon(t) - \beta_1 \epsilon(t-1) - \ldots - \beta_q \epsilon(t-q) $$
$$ \mathrm{cov}[\epsilon(k), \epsilon(\ell)] = \delta_{k\ell} \sigma^2 $$

and is called an autoregressive-moving average (ARMA) process. Since
this model may be viewed as a pair of polynomials in a delay operator, acting
on y(t) and $\epsilon(t)$, the representation is only unique up to common roots in these
polynomials (known to engineers as pole-zero cancellation). This is of
importance when parameters are to be estimated since such common roots
are not identifiable.
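The non-identifiability of common roots is easy to demonstrate numerically. In the sketch below (our own illustration; `impulse_response` is a hypothetical helper) both sides of an AR(1) model are multiplied by the same factor in the delay operator, and the response to a single unit shock is computed for the original and the inflated model.

```python
import numpy as np

def impulse_response(ar_poly, ma_poly, n):
    """Response of an ARMA model to a unit shock at t = 0.

    ar_poly = [1, -alpha_1, ..., -alpha_p] and ma_poly = [1, -beta_1, ..., -beta_q]
    are the polynomials in the delay operator acting on y and on the shocks.
    """
    h = np.zeros(n)
    for t in range(n):
        acc = ma_poly[t] if t < len(ma_poly) else 0.0
        for i in range(1, min(t, len(ar_poly) - 1) + 1):
            acc -= ar_poly[i] * h[t - i]
        h[t] = acc
    return h

# AR(1) model: (1 - 0.9 B) y = e.
h_ar1 = impulse_response([1.0, -0.9], [1.0], 30)

# Same model with the common factor (1 - 0.5 B) on both sides: ARMA(2,1).
ar_big = np.polymul([1.0, -0.9], [1.0, -0.5])
h_arma = impulse_response(ar_big, [1.0, -0.5], 30)
```

The two responses agree to rounding error, so no amount of data can distinguish the two parameter sets. The same helper shows the memory difference noted earlier: a pure MA model (`ar_poly = [1.0]`) has a response that is exactly zero after lag q, while the AR response decays but never vanishes.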

As an example for an ARMA process, Zetterberg and Kjell (121)
have modeled the EEG signal as:

$$ y(k) = \sum_{i=1}^{m} a_i y(k-i) + \sum_{j=1}^{n} b_j e(k-j) + e(k) \qquad \text{(E.1)} $$

where y(k) is the EEG signal at time $t_k = kT$, and e(k) is an assumed
white noise input process. This ARMA model parametrizes the EEG in a
set of m+n parameters $(a_1, \ldots, a_m; b_1, \ldots, b_n)$. The residual process
e(k) (the modeling error) can be computed to determine goodness-of-fit,
for example, by variance tests. The advantage of a model of the form of
(E.1) is that it may be possible to model the EEG signal using a few
parameters. Zetterberg found that m < 5, n < m, gave satisfactory
performance in most cases for the spontaneous EEG. Bohlin (12) has also
considered models of this form. Since many parameters are free to choose,
a significant computational simplification is accomplished by assuming
the moving average parameters $(b_1, \ldots, b_n)$ to be constant for all signals,
so that signal variations can be accounted for by using only the
autoregressive coefficients $(a_1, \ldots, a_m)$. For the purpose of multilead
recording the modeling can easily be extended to vector ARMA processes and
then covers all linear stationary finite dimensional Markov processes.
The basic theory using (E.1) assumes stationarity of the EEG signals.
However, it is well known that the character of the EEG may change
spontaneously. Furthermore, when the change in character is induced by
stimuli, the nature of the change is of interest.

In order to treat this more general problem effectively a different
notation, known as the Kalman message model (Kalman (55)), was introduced.
The model with the observation z(t) is given by

$$ z(t) = H_t x(t) + v(t) \qquad \text{(observation model)} $$
$$ x(t) = F_t x(t-1) + w(t) \qquad \text{(process model)} $$

with the moment assumptions

$$ \mathrm{cov}(v_k, v_t) = \delta_{kt} V_v(k) $$
$$ \mathrm{cov}(w_k, w_t) = \delta_{kt} V_w(k) $$
$$ \mathrm{cov}(v_k, w_t) = 0 $$
$$ E[w(t)] = E[v(t)] = 0 $$

Kalman (55) showed a computationally efficient way to track x(t) in
real time. The algorithm, in combination with nonlinear parameter
estimation, is easily extended to find also estimates of the transition
matrix $F_t$, observation matrix $H_t$, the variance of the measurement noise
$V_v(t)$, and the process noise $V_w(t)$. Moderately nonlinear processes may
be approximated by nonstationarities (Jazwinski (53)). The scheme is also
easily extended to be adaptive (see Appendix G, "Adaptive Filtering") and
robust (see Appendix G, "Artifact Detection and Robustness").
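The tracking recursion for this model can be sketched as a predict step followed by an update step. The version below is the standard textbook form, simplified by assuming time-invariant F, H, and noise covariances (the text allows them to vary with t); the function name `kalman_filter` and its argument names are hypothetical.

```python
import numpy as np

def kalman_filter(z, F, H, Vw, Vv, x0, P0):
    """Track x(t) for the model z = H x + v, x(t) = F x(t-1) + w.

    z  : observations, one row per time step
    Vw : process noise covariance, Vv : measurement noise covariance
    x0 : prior state mean, P0 : prior state covariance
    Returns the filtered states (one row per step) and the final covariance.
    """
    x, P = np.array(x0, dtype=float), np.array(P0, dtype=float)
    states = []
    for zt in z:
        # Predict: propagate the state and its uncertainty one step.
        x = F @ x
        P = F @ P @ F.T + Vw
        # Update: weigh the prediction against the new observation.
        S = H @ P @ H.T + Vv                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
        x = x + K @ (zt - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        states.append(x.copy())
    return np.array(states), P
```

A useful sanity check is the degenerate case of a constant scalar state observed in unit-variance noise (F = H = 1, Vw = 0): with a very uncertain prior, the filtered estimate reduces to the running mean of the observations and the final variance to 1/N.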

Clearly, with increasing generality of schemes, theoretical and
computational problems arise. For example, while for the estimation of
AR parameters the solution is unique and found by elementary matrix
operations, estimating parameters in the Kalman model must be preceded
by an analysis of whether these are observable at all (such as in the
problem: what is a, given c, where c = a + b?). In other words, parameters
are mutually dependent, sometimes such that they cannot be distinguished.
(Note: this observability problem arose already in ARMA models with the
pole-zero cancellation.)

When estimation of parameters is possible, one still has to be
aware of possible nonuniqueness, a result of the complex (usually
nonlinear) relation of the parameters to the data. Hence, when nonlinear
parameter estimation is used, one has not only to recursively optimize
the fit of a model, but also to verify uniqueness by using a sufficient
number of different starting points of the estimation scheme. Since,
in addition, many model structures are possible, they all have to be
compared.

S²I is in possession of routines which perform this task automatically,
but application to models with many parameters (high dimensionality) is
computationally expensive and requires potentially high computational
accuracy. S²I is also working on methods to relax some of the computational
requirements, further automate these routines, and improve their
performance for high dimensional problems. It will not be advisable,
however, to use these routines in a completely unsupervised manner.

One  of  the  conveniences  in  the  use  of  the  Kalman  filter  lies  in  the 
simplicity  with  which  physical  models  are  converted  into  filter  coefficients. 
For  example,  one  may  have  good  a  priori  knowledge  of  the  noise  power 
in  measurements;  this  quantity  can  be  directly  entered  in  the  Kalman 
model.  If  this  knowledge  is  to  be  expressed  in  the  ARMA  structure,  a 
nonlinear  relation  to  ARMA  parameters  arises.  Clearly,  this  further 
exacerbates  the  problem  of  untangling  the  "components"  in  biological 
signals. 

A danger in the use of the Kalman filter for the layman lies in the
rather overwhelming freedom of structures he may choose from. He is then
easily tempted to "overfit" the data (Box and Jenkins (14)). Thus its
use requires experience with modeling, ranging from statistical analysis
of residuals to an understanding of controllability and observability
and an appreciation of numerical complexities.
APPENDIX F


NONLINEAR  SYSTEMS  ANALYSIS 

The nonlinearity of responses of living systems to stimuli, such
as VERs for visual stimulation, is well documented. Nonlinearity is also
seen on more microscopic levels--for example, in the generation and
propagation of action potentials or the quantum release of chemical
transmitter substances of synapses. The importance of nonlinearity can
be appreciated in different ways: on the one hand, it is a means to enrich
the input-output relations of systems and often it is a tool to perform
certain tasks very reliably and inexpensively. In some instances, it is
indeed the optimal approach to certain constrained problems such as the
force constraint in saccadic eye movements. On the other hand, the
richness of input-output relations poses considerable analytical problems;
hence, analysis is typically limited to approximations. These approximations
will often require simulations and iterative numerical procedures to check
and improve solutions.

For  the  understanding  of  such  systems  special  tools  were  developed 
in  the  engineering  sciences.  Typically  the  tools  are  based  on  an 
expansion  such  as  Taylor  or  Volterra  series.  Special  forms  of  linearization 
are  used  for  the  describing  function  analysis  or  extensions  to  the  Kalman 
filter  algorithm.  Applications  of  Volterra  series  expansions  for  system 
analysis  were  pioneered  by  Wiener  (116)  and  subsequently  somewhat  modified 
by  others  (Marmarelis  (67)). 

We may start out with the describing function approach, mainly
because it provides insight into some of the important properties of
nonlinear systems. As such, it is mainly developed for the evaluation
of known nonlinear structures, such as the description of transfer
characteristics or oscillations, as opposed to the estimation of underlying
structure, given the transfer characteristics. By contrast,
extended Kalman filter methods and Volterra series representations
have been developed for the estimation and identification of unknown
structures given input-output relations. However, in this appendix we
will not discuss the Kalman filter algorithm or its extensions, since
the algorithm is based on linear estimation and only its extensions deal with
nonlinearity. The reader is referred to Appendix G where approximations
to nonlinearities and nonstationarities are treated jointly.


DESCRIBING  FUNCTIONS  ANALYSIS 

Describing  function  analysis  was  mainly  developed  in  the  engineering 
sciences  and  resulted  from  the  need  to  describe  nonlinear  devices  such 
as  on-off  controllers  in  an  analytical  fashion  jointly  with  other  possible 
linear  circuitry.  Often  these  on-off  controllers  are  implemented  more 
reliably  or  economically  than  continuous  alternatives.  It  should  thus 
not  be  surprising  that  similar  principles  are  also  favored  in  biological 
systems. As a matter of fact, in many hormonal control schemes (Martin (68))
the on-off approach is rather "popular." However, the usual context of
describing function analysis is that of dynamical mechanical or electrical
elements from engineering. In that field, the specification of elements and
overall structure are known and performance characteristics are of interest.
For our purpose here, the situation is somewhat different. Nevertheless,
we believe the describing function analysis provides a tool to gain insight
into processes such as limit cycles, sub- and superharmonics, and intermodulation.
It may also provide guidance in the design of stimuli and aid in the
discrimination of competing hypotheses.

The main thrust of describing function analysis arises from the convenience
of describing a nonlinearity by its transfer of the fundamental frequency
and/or its transfer of the mean and variance of the input signal. The
sufficiency of such a description arises from the memory behavior of
practical linear components within a system. First, the memory results
in spectral selectivity; second, via the central limit theorem,
it suggests Gaussian output amplitude densities (Wozencraft and Jacobs (119)).
Hence the signal flow and characterization may be accomplished by
considering only a few spectral lines and propagating only
the first two moments of the amplitude distribution functions (which are
given by the mean and variance).
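
As a minimal sketch of the first idea (describing a nonlinearity by its transfer of the fundamental frequency), the following Python fragment numerically estimates the describing function of an ideal on-off element driven by a sinusoid. The function names, the relay level, and the sample count are illustrative assumptions and are not taken from the cited literature.

```python
import numpy as np

def describing_function(nonlinearity, amplitude, n_samples=4096):
    """Ratio of the output's fundamental Fourier component to the input amplitude."""
    # Midpoint samples over one full period of the sinusoidal input.
    theta = 2.0 * np.pi * (np.arange(n_samples) + 0.5) / n_samples
    x = amplitude * np.sin(theta)
    y = nonlinearity(x)
    # In-phase and quadrature Fourier coefficients of the fundamental.
    b1 = 2.0 * np.mean(y * np.sin(theta))
    a1 = 2.0 * np.mean(y * np.cos(theta))
    return complex(b1, a1) / amplitude

def relay(x, level=1.0):
    """Ideal on-off element: the output is +level or -level."""
    return level * np.sign(x)

gain = describing_function(relay, amplitude=2.0)
```

For an ideal relay of level M and input amplitude A, the estimate should agree closely with the standard closed form 4M/(πA); the gain falls as the input amplitude grows, which is the amplitude dependence that makes limit-cycle prediction possible.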

The methodology of describing function analysis thus provides a
moderately  simple  means  to  understand  oscillations  of  biological  systems 
(recall  the  oscillation  of  the  focal  length  of  the  lens  of  the  eye),  and  to 
predict  their  frequencies  and  amplitudes  as  well  as  sensitivity  to  external 
perturbations.  Mechanisms  like  variable  gain  and  the  effect  of  dither 
signals  (signals  which  have  a  linearizing  effect  on  nonlinearities)  can 
all  be  studied  within  that  framework.  For  hypothesized  structures  the 
methodology  may  suggest  signals  which  emphasize  a  particular  feature  of 
a  system,  or  guide  one  to  test  signals  which  allow  improved  estimation 
of  certain  components,  or  possibly  to  discriminate  between  alternatives. 

There is considerable freedom in the design and modification of signals
because of their freedom in space and time. Some guidance about the
way in which such changes should be made appears very important and
may in part be provided by that methodology.


VOLTERRA  SERIES 

The Volterra series representation for the analysis of nonlinear
dynamical networks was first developed by Wiener (116). The concept
evolves easily from the generalization of the impulse response
characterization of linear networks, an approach widely used in engineering.

In linear networks (assume, for simplicity, scalar input and output)
the general response (the output) is completely characterized by the
response to a unit Dirac impulse. Basically, any input other than the
Dirac impulse may be viewed as a limiting superposition of infinitely
many impulses shifted in time. Due to the superposition principle of
linear networks, the response to this arbitrary input waveform is also
the limiting superposition of the individual impulse responses, since
no "interaction" takes place between impulse responses.
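
The superposition argument can be illustrated in discrete time. The sketch below builds the output by adding one shifted, scaled copy of the impulse response per input sample; the impulse response and input values are hypothetical.

```python
import numpy as np

def response_by_superposition(x, h):
    """Sum shifted, scaled copies of the impulse response h, one per input sample."""
    y = np.zeros(len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        # Response to the impulse of height xk applied at time k.
        y[k:k + len(h)] += xk * h
    return y

h = np.array([1.0, 0.5, 0.25])        # hypothetical impulse response
x = np.array([2.0, 0.0, -1.0, 3.0])   # arbitrary input sequence
y = response_by_superposition(x, h)
```

The result is the ordinary discrete convolution of the input with the impulse response, which is why the impulse response alone characterizes a linear network.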

The  generalization  to  nonlinear  system  characterization  is  then 
realized  by  incorporating  the  possibility  of  interaction  of  impulse 
responses.  This  concept  is  precisely  what  is  expressed  by  the  Wiener 
series. To simplify the characterization, Wiener chose orthogonal
functionals. In general, an infinite number of functionals (and their
associated kernels) have to be considered. In practice, this infinite
expansion must be truncated; but from current methodology (Lee and Schetzen
(62)) it is not clear how many kernels have to be calculated for adequate
system characterization. For practical computational reasons,
only first- and second-order kernels are usually calculated.
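
To make the notion of "interaction" concrete, the following sketch evaluates a discrete expansion truncated after the second-order kernel. It uses the plain Volterra form rather than Wiener's orthogonalized functionals, and the kernel values are hypothetical.

```python
import numpy as np

def volterra2_output(x, h1, h2):
    """y(n) = sum_i h1[i] x(n-i) + sum_{i,j} h2[i,j] x(n-i) x(n-j)."""
    m = len(h1)
    xp = np.concatenate([np.zeros(m - 1), x])   # zero initial conditions
    y = np.empty(len(x))
    for n in range(len(x)):
        past = xp[n:n + m][::-1]                # x(n), x(n-1), ..., x(n-m+1)
        y[n] = h1 @ past + past @ h2 @ past
    return y

h1 = np.array([1.0, 0.5])                       # hypothetical first-order kernel
h2 = np.array([[0.2, 0.1], [0.1, 0.0]])         # hypothetical second-order kernel

pulse_a = np.array([1.0, 0.0, 0.0, 0.0])
pulse_b = np.array([0.0, 1.0, 0.0, 0.0])
together = volterra2_output(pulse_a + pulse_b, h1, h2)
separately = volterra2_output(pulse_a, h1, h2) + volterra2_output(pulse_b, h1, h2)
interaction = together - separately
```

The difference between the response to the pulse pair and the sum of the individual responses is produced entirely by the off-diagonal terms of the second-order kernel; for a linear system it would vanish.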

In the application of the approach, other considerations are also
of importance. Instead of a Gaussian white input signal (which carries,
roughly speaking, infinitely many (Dirac) impulses with a particular
amplitude distribution), some other amplitude distribution and nonwhite
signal which can be generated by physical means (finite power) must
be used. Under limiting conditions such a signal usually approaches
a Gaussian white signal.

Marmarelis (67) also points to other practical limitations of
the method. In particular, he notes the strong dependence of kernel
values on input (stimulus) power. Great experimental care has to be
taken, since the dependency becomes more pronounced with increasing
order of the kernel. He discusses a variety of practical considerations
and suggests methods for computing error bounds on the performance of
such analysis. Some theoretical difficulties associated with the dependence
of kernel computation on the input function are also presented. A
good example of the application of the method to quantification of
multiple sclerosis is shown in Sclabassi et al. (99). In this example, the
intuitive meaning of the second-order kernel as a measure of interaction
between a pair of stimuli is quite appealing to the clinician, for
the integrity of portions of the nervous system
is certainly characterized by such interactions.


APPENDIX G


ADAPTIVE  AND  ROBUST  SCHEMES 


ADAPTIVE  NOISE  CANCELLING 

One of the principal objectives in the EEG signal-processing task
under consideration is the elimination of spurious signals (noise) from
the desired VER. The problem is complicated by the fact that there are
no simple methods for modeling the nature of the noise or the
interrelationships of the signals present at the various scalp electrode
locations. In this case, we need to be very careful in any modeling
assumptions we make to ensure that they do not place unnecessarily severe
limitations on the quality of the results; that is, we seek techniques
that are robust to modeling errors. In a general sense, we can enhance
robustness by using a minimum number of assumptions in our models and by
using a minimal number of parameters as well.

One technique which uses a very minimal number of assumptions as to
the nature of the data is the adaptive noise-cancelling technique of
Widrow et al. (115). The form of the problem and its solution are shown
in Figure G-1 using, for simplicity in presentation, a single signal channel
and a single noise channel. In the figure, the signal $S$ is corrupted by
noise $n_0$. Two electrodes are used; the first electrode records signal
plus noise ($S + n_0$) and the second electrode records the noise $n_1$. $n_0$ and
$n_1$ are related by an unknown transformation which is dependent upon the
properties of the medium through which the noise travels. Clearly, if
$n_0 = n_1$, the signal can be recovered directly by subtraction ($S = S + n_0 - n_1$).

Figure G-1. Adaptive noise-cancelling concept.

Thus, we see that the problem of recovery of the signal is related to the
properties of the conducting medium and, it turns out, the properties of
the noise itself. Since the spurious signals we wish to estimate may
change dramatically in character with time, we need a method to adapt
to these unpredictable changes. The adaptive noise-cancelling method
is an appropriate technique for this problem.

By inspection of Figure G-1 it is clear that we wish to make the adaptive
filter output $y$ "track" the unknown noise $n_0$ as closely as possible. If
we make the crucial assumption that the signal $S$ is uncorrelated with $n_0$
and $y$, then the problem is solved by adjusting $y$ to give minimum output
power (the power in the canceller output $z = S + n_0 - y$).
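
A minimal sketch of this minimum-output-power idea follows, assuming a finite impulse response filter on the reference channel adjusted by the LMS rule. The tap count, step size, test signal, and noise path are illustrative assumptions.

```python
import numpy as np

def lms_noise_canceller(primary, reference, n_taps=8, mu=0.01):
    """Return z = primary - y, where y is an FIR filter of the reference
    whose weights are adjusted by the LMS rule to reduce the power in z."""
    w = np.zeros(n_taps)
    z = np.zeros(len(primary))
    for k in range(n_taps - 1, len(primary)):
        r = reference[k - n_taps + 1:k + 1][::-1]   # most recent reference samples
        y = w @ r                                   # current estimate of the noise n0
        z[k] = primary[k] - y                       # canceller output
        w += 2.0 * mu * z[k] * r                    # LMS weight update
    return z

rng = np.random.default_rng(0)
n = 5000
t = np.arange(n)
s = np.sin(2 * np.pi * t / 50.0)                    # stand-in for the desired signal S
n1 = rng.standard_normal(n)                         # noise seen by the reference electrode
n0 = np.convolve(n1, [0.8, -0.4, 0.2])[:n]          # noise reaching the primary electrode
z = lms_noise_canceller(s + n0, n1)

err_before = np.mean(n0[-1000:] ** 2)               # noise power in the raw primary channel
err_after = np.mean((z[-1000:] - s[-1000:]) ** 2)   # residual noise after adaptation
```

Because the signal is uncorrelated with the reference, minimizing the power in the output can only remove noise; the residual error after convergence is set mainly by the step size.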

The adaptive noise canceller may be used in various ways. By assuming
that the desired signal is the VER, we can attempt to cancel disturbances
and make z a good estimate of the VER. Another approach would
be to treat a desired response (such as flash response) as the signal and
make z a good estimate of the response. In this case, the noise would
include the VER.

The adaptive noise canceller has found many applications as a result
of its very general nature. These include estimation of fetal ECG by
elimination of maternal ECG, elimination of radar sidelobes, notch
filtering, noise cancelling in speech, self-tuning filters, and spectral
estimation.


LONGINI'S  NOISE  CANCELLATION  VIA  ORTHOGONAL  BASIS  FUNCTIONS 

A particular type of adaptive algorithm is based on orthogonal basis
function representation; it thus has some similarity with the
Karhunen-Loeve expansion and may be viewed as a communication-theoretic approach.

The method applies to models of the structure assumed by Widrow's adaptive
noise cancellation. The solution proceeds, however, in a different fashion
and provides fast convergence. For nearly periodic interference of
noise with the signal, a further advantage over Widrow's method arises,
since nonstationarity can easily be accounted for by rather primitive
windowing. Due to the better convergence properties of Longini's method
compared with Widrow's, less danger of system instability exists. Clearly, a
price has to be paid for these conveniences in terms of computational
complexity.

Just as in Widrow's LMS algorithm, the linear but structurally
unknown transmission characteristics for noise and the direct observation
of the noise sources are exploited in a least-squares sense. The basic
version of Longini's (65) method is based on the selection of sequential
frames of data; within each of these frames the noise cancellation is
done independently. The selection of frames is often given in a very
natural way by periodic "events" such as heart beats, artificial stimuli,
or other oscillations.

The procedure then calls for the estimation of noise transmission via
estimation of covariances. Knowledge of these covariances may then be
exploited to construct an orthogonal set of waveforms via the Gram-Schmidt
procedure. Minimization of the noise in the signal then results directly
from subtraction of these orthogonal waveforms from the observed data
containing the signal. It is the use of the orthogonal set of waveforms
which leads in this quadratic minimization to a one-step "convergence."
Since Widrow's method does not orthogonalize its waveforms, convergence
via his particular (nonlinear parameter) estimation scheme is slow if
noise sources are correlated (either in space or time, depending on the
particular problem). On the other hand, when noise sources may be assumed
independent, nothing can be gained by Longini's orthogonalization approach,
but much computation is saved by Widrow's algorithm. Longini's method
requires n(n-1)/2 correlations for n correlated noise sources.
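
The frame-wise procedure can be sketched as follows, under the simplifying assumption that each noise reference enters the observed channel only through a scalar gain (no filtering). The function name and the test data are hypothetical, and the code illustrates the orthogonalization idea rather than the published algorithm.

```python
import numpy as np

def cancel_frame(observed, noise_refs):
    """Remove from `observed` its projection onto the span of `noise_refs`
    (one reference per row), using Gram-Schmidt to build orthogonal waveforms."""
    basis = []
    for ref in noise_refs:
        v = ref.astype(float)
        for b in basis:                       # orthogonalize against earlier waveforms
            v = v - (v @ b) / (b @ b) * b
        if v @ v > 1e-12:                     # skip references that add nothing new
            basis.append(v)
    cleaned = observed.astype(float)
    for b in basis:                           # one subtraction per orthogonal waveform
        cleaned = cleaned - (observed @ b) / (b @ b) * b
    return cleaned

rng = np.random.default_rng(2)
n = 400
s = np.sin(2 * np.pi * np.arange(n) / 40.0)          # stand-in for the desired signal
n_a = rng.standard_normal(n)
n_b = 0.9 * n_a + 0.3 * rng.standard_normal(n)       # second source, correlated with the first
observed = s + 1.0 * n_a + 0.5 * n_b
cleaned = cancel_frame(observed, np.vstack([n_a, n_b]))
```

Because the waveforms are mutually orthogonal, each coefficient is found independently in a single pass over the frame; no iteration is needed, which is the one-step "convergence" referred to above.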

For the purpose of real-time noise cancellation, a fairly simple
modification exists for Longini's approach. Instead of using frames of
data, an exponential weighting function that ages past data can be used. In
this form the algorithm is being considered for implementation on a clinical
instrument for removing maternal ECG signals (noise) from the abdominal
fetal ECG signal (Longini et al. (65)).


ADAPTIVE  FILTERING 

The  adaptive  estimation  technique  given  in  Appendix  G,  "Adaptive  Noise 
Cancelling,"  was  predicated  only  on  several  vague  assumptions  concerning 
the  character  of  the  EEG;  specifically,  signal  and  noise  components  were 
assumed  uncorrelated.  No  particular  signal  structure  was  presumed.  In 
Appendix E, it was shown that the EEG (particularly the spontaneous EEG)
does have certain definable statistical characteristics and that
autoregressive models may be useful in EEG analysis. It would appear to follow
naturally  that  the  VER  would  have  even  more  definable  structure  due  to  the 
controlled  nature  of  the  input  signal.  If  this  is  so,  then  it  also  follows 
that  adaptive  algorithms  should  have  more  structure  than  allowed  by  Widrow's 
method.  This  subsection  will  discuss  a  potentially  powerful  approach  to 
adaptive  filtering  using  more  highly  structured  models. 

The  basic  theory,  discussed  in  Appendix  E,  assumes  stationarity  of 
the  EEG  signals.  However,  it  is  well  known  that  the  character  of  the 
EEG  may  change  spontaneously.  Furthermore,  when  the  change  in  character 
is  induced  by  stimuli,  the  nature  of  the  change  is  of  interest. 

An example of the variable character of the spontaneous EEG is shown
in Figure G-2. The resultant time-variable spectra are shown in Figure G-3.
Each curve in this plot is a power spectral density, averaged over 1.6
sec, and taken over successive 1.6-sec intervals. Figure G-3, sometimes
called a compressed spectral array (CSA), is often used to
analyze the time-varying nature of biological signals. The figure can
give us qualitative information as to the nature of the nonstationarities
in the EEG. However, we need to obtain quantitative information in order
to study the problem more precisely. To do this, we now turn to a
discussion of the analysis of nonstationary EEG signals.
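
A compressed spectral array of this kind can be sketched as a stack of spectra from successive windows. In the fragment below each row is a single tapered periodogram (scaling conventions vary); the sampling rate, taper, and test signal are assumptions made only for illustration.

```python
import numpy as np

def compressed_spectral_array(x, fs, window_sec=1.6):
    """Spectral estimates over successive, non-overlapping windows.
    Returns (frequencies, array with one spectrum per row)."""
    n = int(round(window_sec * fs))
    n_windows = len(x) // n
    taper = np.hanning(n)
    scale = fs * np.sum(taper ** 2)           # periodogram normalization
    spectra = []
    for i in range(n_windows):
        seg = x[i * n:(i + 1) * n]
        seg = (seg - seg.mean()) * taper      # remove the mean, then taper
        spectra.append(np.abs(np.fft.rfft(seg)) ** 2 / scale)
    return np.fft.rfftfreq(n, d=1.0 / fs), np.array(spectra)

fs = 100.0                                    # hypothetical sampling rate in Hz
t = np.arange(1600) / fs
# Test signal whose dominant frequency jumps from 10 Hz to 20 Hz halfway through.
x = np.where(t < 8.0, np.sin(2 * np.pi * 10.0 * t), np.sin(2 * np.pi * 20.0 * t))
freqs, csa = compressed_spectral_array(x, fs)
peak_first = freqs[np.argmax(csa[0])]
peak_last = freqs[np.argmax(csa[-1])]
```

Plotting the rows of `csa` one above the other, slightly offset, gives the familiar CSA display; the shift of the spectral peak between the first and last rows is the kind of nonstationarity the figure is meant to reveal.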

Figure G-2. Sample EEG.


Nonstationary Models


For the nonstationary case, it is useful to think of the ARMA model
as time-varying; that is, the coefficients now become functions of time.
These functions are unknown, due to the unpredictability of the EEG
signal. Thus, we need a way of estimating them from the data. The most
powerful and general methodology for accomplishing this is adaptive
filtering.

Before discussing adaptive filtering, we wish to remark that the
ARMA model causes difficulties if the moving average parameters also
change in time. ARMA modeling is very difficult for nonstationary
processes. Bohlin (12) has developed an adaptive filter for tracking
the AR parameters while keeping the MA parameters constant; he did not
consider time-varying MA parameters. It is possible, however, to resolve
this problem by using a different problem formulation called state-space
modeling. The state-space models we will consider are of the form:

$$ y(k) = H\,x(k) + r(k) \tag{G.1} $$

$$ x(k) = \Phi(k)\,x(k-1) + G(k)\,u(k-1) \tag{G.2} $$

where

$y(k)$ is the output (measured EEG),
$x(k)$ is the $n \times 1$ state vector,
$H$ is a $1 \times n$ measurement matrix,
$\Phi(k)$ is an $n \times n$ state transition matrix, and
$G(k)$ is an $n \times 1$ input vector.

The output $y(k)$ may be a vector. The noise input $u(k)$ produces
uncertainties in $x(k)$, while the noise $r(k)$ acts as channel noise.

The state vector $x(k)$, and hence the output $y(k)$, may be estimated
using modern estimation theory. If we assume that $\{u(k)\}$ and $\{r(k)\}$ are
zero-mean white Gaussian processes with known second moments, that $x(k)$
is Gaussian, and that $H$, $\Phi(k)$, and $G(k)$ are known, then the best estimator
is a Kalman filter, which is of the form

$$ \hat{x}(k) = \Phi(k)\,\hat{x}(k-1) + K(k)\,v(k) \tag{G.3} $$

$$ v(k) = y(k) - H\,\Phi(k)\,\hat{x}(k-1) \tag{G.4} $$

Here $\hat{x}(k)$ is the minimum-variance estimate of $x(k)$, and $v(k)$ is the
measurement residual, or innovation, which represents the new information
brought in by the measurement $y(k)$. The last term in (G.4) is the
predicted value of $y(k)$; hence, if $v(k) = 0$, we are not getting any new
information from $y(k)$. The matrix $K(k)$ is a gain matrix which controls
the rate at which new information is incorporated into the estimate $\hat{x}(k)$.
It is computed as a function of $\Phi(k)$, $H$, $G(k)$, and the noise covariance
matrices.

With this brief background, we are now finally in a position to
discuss adaptive filtering.
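
Before moving on, the filter of (G.3)-(G.4) can be sketched in a few lines. The gain computation below is the standard covariance recursion, which the text does not spell out; the model matrices, noise levels, and variable names are hypothetical, chosen to represent a second-order autoregressive signal observed in channel noise.

```python
import numpy as np

def kalman_step(x_prev, P_prev, y, Phi, H, G, Q, R):
    """One cycle of the filter: predict, form the innovation, correct."""
    x_pred = Phi @ x_prev                           # predicted state
    P_pred = Phi @ P_prev @ Phi.T + G @ Q @ G.T     # predicted error covariance
    v = y - H @ x_pred                              # innovation, as in (G.4)
    S = H @ P_pred @ H.T + R                        # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)             # gain matrix K(k)
    x_new = x_pred + K @ v                          # state estimate, as in (G.3)
    P_new = (np.eye(len(x_prev)) - K @ H) @ P_pred
    return x_new, P_new, v

# Hypothetical model: state holds the current and previous signal values.
Phi = np.array([[1.5, -0.8], [1.0, 0.0]])
G = np.array([[1.0], [0.0]])
H = np.array([[1.0, 0.0]])
Q = np.array([[1.0]])       # variance of the driving noise u
R = np.array([[4.0]])       # variance of the channel noise r

rng = np.random.default_rng(1)
n = 2000
x = np.zeros(2)
signal = np.zeros(n)
measured = np.zeros(n)
for k in range(n):
    x = Phi @ x + G[:, 0] * rng.standard_normal()
    signal[k] = x[0]
    measured[k] = signal[k] + 2.0 * rng.standard_normal()

x_hat, P = np.zeros(2), 10.0 * np.eye(2)
estimate = np.zeros(n)
for k in range(n):
    x_hat, P, v = kalman_step(x_hat, P, measured[k:k + 1], Phi, H, G, Q, R)
    estimate[k] = x_hat[0]
```

When the model matrices are correct, the estimate is closer to the underlying signal than the raw measurement is; the adaptive schemes discussed next address the case where the matrices are not known in advance.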


Adaptive  Kalman  Filtering 

Since the models we are discussing here have a particular parametric
structure, we expect that development of adaptive techniques will be more
complex than the Widrow algorithm. This is indeed the case. Adaptive
filtering based on a Kalman filtering methodology has been an active
field of research for at least a decade. A good review is given in
Mehra (70).

Perhaps the most generally powerful technique is the maximum
likelihood approach, in which we attempt to compute the most likely set of
parameters in time. Bohlin (12) has used this approach for EEG analysis
(while restricting his study to AR models) and developed adaptive filters
in a special integer arithmetic implementation to maximize computational
speed. His results indicated that the approach could provide a useful
man-readable interpretation of the EEG.

Duval (35) has developed a more general adaptive algorithm which
can be used for the model of (G.1)-(G.2). He considered only the problem
of adapting the gain matrix $K(k)$. More recently, Gustafson and Ledsham
(47) have developed an adaptive filter for tracking the transition matrix
$\Phi(k)$ in real time. The form of these adaptive filters is similar. To
illustrate the technique, suppose we are interested in tracking only $\Phi(k)$.
Then the adaptive filter takes the recursive form:

$$
\begin{aligned}
\text{filter:} \quad & v(k) = y(k) - H\,x'(k) \\
& \hat{x}(k) = x'(k) + K(k)\,v(k) \\
\text{adaptor:} \quad & \Phi^*(k) = \Phi^*(k-1) - f(v(k)) \\
& \hat{\Phi}(k+1) = \hat{\Phi}(k) + \theta\,[\Phi^*(k) - \hat{\Phi}(k)] \\
\text{propagation:} \quad & x'(k+1) = \hat{\Phi}(k+1)\,\hat{x}(k)
\end{aligned}
$$

The first two equations ("filter") incorporate the measurement $y(k)$ into
the state estimate $\hat{x}(k)$. The quantity $x'(k)$ is the predicted value of
the state $x(k)$ prior to incorporating the measurement $y(k)$.

Adaptation of $\Phi(k)$ takes place in two steps. First, the optimal
estimate $\Phi^*(k)$ is found using the maximum likelihood equations. The
function $f(v(k))$ is linear in $v(k)$. Next, the estimate $\hat{\Phi}(k+1)$ is computed
using an update rate parameter $\theta$. If $\theta = 1$, then $\hat{\Phi}(k+1) = \Phi^*(k)$. If
$\theta = 0$, $\hat{\Phi}(k+1) = \hat{\Phi}(k)$ and the estimate does not change from its previous
value. Thus $\theta$ controls the speed of adaptation.

The final step is propagation of the state estimate using the
new estimate $\hat{\Phi}(k+1)$.

This algorithm is easily extended to include simultaneous adaptation
of $\Phi(k)$, $K(k)$, and $G(k)$, although the equations become more complex. As
mentioned previously, S^I has applied these algorithms to other problems
and is quite experienced in their use.


PIECEWISE  STATIONARY  MODELING 

Another approach to the analysis of nonstationary signals is to
segment them into stationary, or quasi-stationary, segments. It is
undoubtedly true that the nonstationarity of the VER, using almost any
nonstationarity measure, increases as the data epoch increases. It has
been experimentally verified (e.g., McGillem and Aunon (69)) that
segmenting the VER and using latency-correcting techniques for each segment
results in higher signal energies and more sharply defined responses.

More recently, Segen and Sanderson (100) have used piecewise stationary
autoregressive models for the spontaneous EEG. The data were segmented
using a cluster analysis of the model parameters. Their results
demonstrate clearly the possible improvement in signal tracking attainable using
segmentation of the EEG. This same technique should be applicable as well
to the modeling of the VER (cf. Figure G-2 for an example of the
nonstationarity of the VER).


ARTIFACT  DETECTION  AND  ROBUSTNESS 

Large  real-world  data  sets  will  always  tend  to  be  corrupted  by 
artifacts,  some  of  which  cannot  be  accounted  for.  When  the  occurrence 
of  artifacts  is  rare  and  of  little  power,  they  may  be  ignored.  In  VER/EEG 
analysis,  however,  rather  large  artifacts  such  as  from  frequent  saccades 
of  the  eye,  periodic  blinking  of  the  eyelid,  and  loosening  of  electrodes 
may  occur.  Analysis  of  VER/EEG  data  should  include  these  effects. 

Robustness expresses the concept of good system performance when
structured or other deviations from the assumed model arise. The concept
has received much attention in control theory and statistics in the last
decade. The objective when making a particular procedure robust
is to trade a little of the known good properties of that procedure
against resistance to model errors. The general theoretical treatment
is very hard or even intractable in all but the most trivial situations.
Consequently, investigation of robustness properties is guided by
consideration of limiting cases, possibly expressed in bounds and typically
checked by simulation. Usually only robustness against a few types of
model deviations is accomplished, but never against all.

Applied to VER/EEG analysis, this could mean that an algorithm is capable
of continued good performance despite saccades, blinks, or other erratic
events; possibly the algorithm might also cope with an occasionally
misplaced electrode. The robust scheme will typically "suspect" or "detect"
model deviations and reduce the weight given to such data. A
good example of such a robust method is given by Athans et al. (7),
where a ballistic reentry vehicle produces an ionic wake; when observing
the vehicle by radar, the wake may also reflect the radar beam, resulting
in erratic measurements. Based on a likelihood argument, the well-known
Kalman filter algorithm is modified to be "cautious" in incorporating
such erratic data. In a comparison, the "optimal" Kalman filter algorithm
lost track of the vehicle; the robustified version did not. Furthermore,
when there was no wake, there was little difference in the performance
of the two versions of the algorithm.

Lately, the concept of robustness has also been combined with the concept
of adaptivity. For example, one would like to make a procedure more robust
when there is evidence of model deviation, but approach the optimal
method if there is no such indication. Some theoretical work in this
direction has been done by Prescott (82) for adaptive trimming
proportions for the estimation of means.
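
The idea of trimming only when the data suggest it can be sketched as follows. The outlier rule used here (a threshold on the scaled median absolute deviation) is an illustrative assumption and is not the procedure of the cited work; the sample values are hypothetical.

```python
import numpy as np

def trimmed_mean(samples, n_trim):
    """Mean after discarding the n_trim smallest and n_trim largest samples."""
    x = np.sort(np.asarray(samples, dtype=float))
    n_trim = min(n_trim, (len(x) - 1) // 2)       # always keep at least one sample
    return x[n_trim:len(x) - n_trim].mean()

def adaptive_trimmed_mean(samples, threshold=3.5):
    """Trim only as much as a crude outlier check suggests (illustrative rule)."""
    x = np.asarray(samples, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0.0:
        return x.mean()
    suspect = np.abs(x - med) > threshold * 1.4826 * mad
    return trimmed_mean(x, int(suspect.sum()))

clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9]
with_artifacts = clean + [60.0, 75.0]             # two blink-like outliers

robust_estimate = adaptive_trimmed_mean(with_artifacts)
plain_estimate = float(np.mean(with_artifacts))
clean_estimate = adaptive_trimmed_mean(clean)
```

With no suspect samples the estimator reduces to the ordinary mean, which is optimal for Gaussian data; when artifacts are present it trims them away at a small cost in efficiency.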

Some experience related to adaptive trimming in dynamical systems
was also gained by one of the authors. There the problem arose of
describing dynamic fluctuations (arrhythmia) of the fetal heart rate in
the presence of erratic artifacts due to maternal heart beat, uterine
contraction during labor, and electrode imperfections.

Examples:  Adaptive  Spectral  Line  Enhancing 

Adaptive  methods  are  quite  useful  in  extracting  periodic  components 
from  broadband  noise.  For  example,  this  approach  could  be  used  to  track 
VER  frequencies  at  or  near  the  input  frequency. 

As  a  result  of  random,  unknown  modulation  effects  within  the  brain, 
the  counterphase  frequency  component  may  be  changed  slightly  within  the 
measured  VER.  This  "detuning"  is  something  we  would  like  to  be  able  to 
track. As a simple example, suppose that the counterphase frequency is
$f_0 = \omega_0 / 2\pi$ Hz. The fundamental signal component is

$$ s_0(t) = \cos \omega_0 t $$

Now assume that a phase modulation of the form

$$ \phi(t) = \alpha \sin \omega_0 t $$

is introduced. The signal then becomes

$$ s(t) = \cos(\omega_0 t + \alpha \sin \omega_0 t) $$

The Fourier coefficient at frequency $f_0$ is
afg(a)  =  Jg(a)  -  J  J^a) 

where $J_\nu(\alpha)$ is the Bessel function of order $\nu$. For small $\alpha$,

$$ a_{f_0}(\alpha) \approx 1 - 0.75\,\alpha - 0.159\,\alpha^2 $$

The parameter $\alpha$ represents the maximum phase deviation one would expect
over one counterphase cycle. For example, McGillem and Aunon (69) found
phase deviations of up to 20 msec over VER segments of 100 msec. For
a counterphase frequency of 8 Hz this gives $\alpha = 0.16$ and

$$ a_{f_0}(0.16) \approx 0.88 $$

Thus, expected small latency shifts give rise to significant reductions
in output signal power (here 23%) at the counterphase frequency.
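
The arithmetic behind these figures can be checked directly from the small-deviation approximation quoted above. The sketch assumes that the deviation is obtained as the 20-msec shift expressed as a fraction of an 8-Hz cycle, which is one reading that reproduces the quoted value of 0.16; the function name is hypothetical.

```python
def fundamental_coefficient(alpha):
    """Small-deviation approximation quoted in the text."""
    return 1.0 - 0.75 * alpha - 0.159 * alpha ** 2

alpha = 0.020 * 8.0                  # 20 msec at 8 Hz, as a fraction of one cycle
coeff = fundamental_coefficient(alpha)
power_loss = 1.0 - coeff ** 2        # fractional loss of power at the fundamental
```

Since power varies as the square of the coefficient, a coefficient near 0.88 corresponds to a power loss of roughly 23%.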

Frequency tracking may be accomplished by the Adaptive Line Enhancer
(ALE) (cf. Tufts et al. (110)). An example of the performance improvement
relative to the FFT is shown in Figure G-4. The probability of detecting
a constant-frequency signal in wideband Gaussian noise is plotted vs.
input signal/noise ratio for several values of false alarm probability
($P_{FA}$). Performance improvement at low $P_{FA}$ is quite significant.
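
The enhancement step of an ALE can be sketched as an LMS predictor driven by a delayed copy of its own input. The delay, tap count, step size, sampling rate, and test signal below are illustrative assumptions; the sketch shows only the enhancement, not the detection statistics of Figure G-4.

```python
import numpy as np

def adaptive_line_enhancer(x, delay=1, n_taps=32, mu=0.001):
    """Predict x(k) from samples delayed by `delay`; the prediction retains
    the narrowband (predictable) part and suppresses broadband noise."""
    w = np.zeros(n_taps)
    y = np.zeros(len(x))
    for k in range(delay + n_taps - 1, len(x)):
        past = x[k - delay - n_taps + 1:k - delay + 1][::-1]   # delayed input samples
        y[k] = w @ past                                        # enhanced output
        w += 2.0 * mu * (x[k] - y[k]) * past                   # LMS weight update
    return y

rng = np.random.default_rng(3)
fs = 128.0                                        # hypothetical sampling rate in Hz
n = 8000
s = np.sin(2 * np.pi * 8.0 * np.arange(n) / fs)   # 8-Hz line component
x = s + rng.standard_normal(n)                    # line buried in broadband noise
y = adaptive_line_enhancer(x)

err_in = np.mean((x[-2000:] - s[-2000:]) ** 2)
err_out = np.mean((y[-2000:] - s[-2000:]) ** 2)
```

A one-sample delay suffices here because the noise is white; for colored noise the delay would need to exceed the noise correlation time so that only the periodic component remains predictable.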


Figure G-4. A comparison of receiver operating characteristic
(ROC) curves of the conventional and ALE detectors.
(After Dentino et al., Proc. IEEE Conf. on Decision and Control,
p. 1377, 1978.)


APPENDIX H


EXPERIMENT  DESIGN 

The design and operational details of a VER experiment can have a
significant impact on the effectiveness of signal processing. In
particular, critical parameters in the signal source, subject condition, and
experimental procedure should be regarded as valuable data inseparable
from the measured EEG. These parameters may be classified as a) controlled,
b) uncontrolled but measured, and c) important but unmeasured. Of
course, the same parameter may be controlled in one experiment but an
unmeasured disturbance in another. Desmedt (28) contains (in chapter I
on methodology) a good description of such parameters in VER research.

The  first  group  of  parameters  contains  all  of  the  factors  directly 
controlled  by  the  experimenter.  These  parameters  include  a)  stimulus 
details,  for  example,  color,  pattern  characteristics,  flash  frequency, 
and  eye  illumination;  b)  subject  details,  for  example,  concentration  points 
and  task  performance  during  the  experiment;  and  c)  measurement  details, 
such as electrode placement and characteristics, analog processing before
digitization,  and  digitization  parameters. 

Other parameters may be equally important, although uncontrolled
(or even uncontrollable) but measured. Included in this category are
subject blinking or eye movement, time of day (patient awareness), and
electrode impedance characteristics. The effect of some parameter
variations may be reduced, rather than measured, by proper experiment
design. For instance, the effect of impedance variations may be reduced by using a
high-impedance amplifier at the sacrifice of noise level (see Appendix B).

The last group of parameters, uncontrolled and unobserved, contains
the most troublesome variables for the experimenter. Subject attention
and focal point, random stimulus variations (e.g., frequency jitter),
pupil dilation, and electrode noise can all add unaccounted variability
to an experiment session. The potential damage is much worse when the
data are processed long after they are taken, possibly resulting in a wasted
session rather than merely one bad run. Data processing techniques which
give instant results, not unlike instant pictures, are quite valuable
even if inferior in quality to post-processing.

Proper experiment design may test for the sensitivity of the results
to any questioned parameter and modify the experiment when necessary.
For example, a simple VER extraction technique (for removing the
background EEG) in real time can be used to position electrodes for maximum
response and may reduce day-to-day variability in the data, thus providing
better data for later, more sophisticated, post-processing.

Experiment design may also prevent unmeasured disturbances from
corrupting the data. For example, a Maxwellian view technique may be
used to prevent random pupil dilations (see Appendix B) from unintentionally
amplitude-modulating the stimulus. Another example is the possibility
of (deliberately) frequency-modulating the pattern reversal rate in the
alpha-wave region to prevent the EEG from locking to the stimulus. If the
response to a blinding flash is measured by synchronously demodulating
(see Appendix D) the pattern response, a much different result from that of a
constant-rate experiment (with EEG entrainment) may be obtained. This
may permit EEG entrainment time constants to be distinguished from
flashblindness recovery times.


APPENDIX I


EVALUATION  OF  PERFORMANCE 

Evaluation of the performance of a signal-processing method or
experiment  is  inherently  based  on  costs  and  benefits  (negative  costs, 
also  considered  utility).  Ideally,  an  evaluation  should  aid  in  modifying 
the  structure  analyzed,  so  as  to  improve  overall  utility  or  reduce  costs. 

A  difficulty  exists  in  that  utility  and  cost  are  usually  not  fixed  as  a 
project  progresses,  and  often  they  are  hard  to  specify  at  all.  Thus  the 
best  one  can  do  initially  is  to  discuss  aspects  related  to  these  costs  as 
they pertain to the balance between signal analysis and other research
objectives. 

Typically  simpler  structures  can  be  evaluated  more  objectively, 
complex  problems  more  subjectively.  In  this  appendix  we  will  start  out  with 
a  discussion  of  simple  structures  and  turn  to  more  complex  situations. 


EVALUATION  OF  PERFORMANCE  OF  SIMPLE  STRUCTURES 

Earlier,  in  Appendix  A,  we  discussed  the  general  philosophy  of  signal 
analysis  as  a  task  to  separate  information  in  data  from  its  random 
component.  The  separation  is  rarely  complete;  in  many  cases,  complete 
separation  is  not  even  necessary  to  meet  one's  goals,  while  in  other  uses 
it  may  be  mandatory. 

Special  tools  have  been  developed  in  statistics  to  identify  when 
separation  is  nearly  complete.  The  tests  concern  the  residuals  of  the 
model:  that  random  part  not  accounted  for  by  the  model.  A  variety  of 
tests  for  the  mutual  independence  of  residuals  (from  one  sample  point 
to  another)  have  been  developed,  each  with  a  specific  diagnostic  value 
and  power  to  discriminate  against  certain  alternatives.  For  Gaussian 
distributions,  which  are  very  important  in  much  of  signal  analysis,  a 
test  for  uncorrelated  residuals  is  equivalent  to  a  test  for  their  mutual 
independence. 

For  example,  consider  an  EEG  tracing.  We  select  a  window-subsection 
in  order  to  fit  a  model  and  would  like  to  detect  when  in  subsequent 
windows  a  significant  model  change  occurred,  so  that  we  may  update  our 
model.  In  this  case,  we  may,  for  example,  simply  compute  the  one-lag 
autocorrelation  value  of  the  residuals  for  any  of  these  new  data  windows; 
when  a  critical  value  is  exceeded  we  are  willing  to  proceed  with  the 
possibly costly (in terms of computer time) reestimation of our model.
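
The window test described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: the function names, the 1.96/sqrt(n) critical value (a large-sample approximation for white residuals), and the simulated residuals are all assumptions introduced for the example.

```python
import math
import random


def lag1_autocorrelation(residuals):
    """Sample lag-1 autocorrelation coefficient of a residual sequence."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    numer = sum(centered[i] * centered[i + 1] for i in range(n - 1))
    return numer / denom


def model_needs_update(residuals, z_crit=1.96):
    """Flag a window whose lag-1 autocorrelation exceeds the critical value.

    For white residuals the coefficient is roughly N(0, 1/n), so the
    (assumed) critical value is z_crit / sqrt(n).
    """
    r1 = lag1_autocorrelation(residuals)
    return abs(r1) > z_crit / math.sqrt(len(residuals)), r1


rng = random.Random(0)
# Residuals of a model that still fits: white noise (simulated)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]
# Residuals after a model change: first-order dependence with coefficient 0.6
colored = [0.0]
for _ in range(499):
    colored.append(0.6 * colored[-1] + rng.gauss(0.0, 1.0))

flag_white, r_white = model_needs_update(white)
flag_colored, r_colored = model_needs_update(colored)
print("white residuals:   r1 = %.3f, reestimate = %s" % (r_white, flag_white))
print("colored residuals: r1 = %.3f, reestimate = %s" % (r_colored, flag_colored))
```

Only when the flag is raised would the costly reestimation be started; the critical value trades false alarms against delayed detection of a model change.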

The  importance  of  mutually  independent  residuals  is  intuitively  very 
appealing  in  case  of  sequences  of  events,  such  as  in  time  series.  For 
example,  when  we  forecast  an  observation  based  on  present  and  past,  and 
the  difference  between  our  forecast  and  the  eventual  observation  is 
independent  of  all  our  knowledge  (including  all  past  forecasts  and 
observations),  we  have  done  the  best  possible  job--all  structural  information 
in  the  ongoing  process  is  known  to  us.  It  is  only  the  randomness  of  nature 
which  surprises  us  and  creates  an  innovation.  On  the  contrary,  if  the 
difference  between  forecast  and  eventual  observation  depends  on  the  past, 
we  could  use  this  dependency  to  improve  our  forecast;  hence  we  did  not 
use  the  optimal  scheme. 
The importance of this concept was emphasized by Wiener (116).

In  the  case  of  linear  dynamical  structures  and  Gaussian  densities,  testing 
for  uncorrelated  innovation  is  equivalent  to  testing  for  optimality  of 
the  scheme.  Such  uncorrelated  sequences  are  usually  called  white  sequences 
(they  need  not  be  Gaussian)  in  resemblance  to  the  stochastic  properties 
of white light. Observe that colored light, such as that from lasers, is
highly structured, as expressed by its coherence properties. Again, in
resemblance to colored light, (innovation) sequences which are correlated
and hence contain much structure are regarded as colored (see Sage and
Melsa (94, 95)).

For  cases  other  than  linear  structures,  checking  correlation  is  in 
general  insufficient  to  detect  dependency.  But  in  many  statistical 
models  local  linearizations  are  possible  and  hence  "moderately"  nonlinear 
problems  may  still  be  analyzed  by  correlation  procedures.  Possibly  one 
might also use Dewan's (31) generalized procedure. Thus testing
correlation remains one of the most important and also diagnostic procedures
to establish optimality of a scheme.

An alternative to checking for the "optimality" of a signal-processing
scheme in terms of residuals is to look at the usefulness of one's
scheme  to  express  gross  features.  For  example,  the  second-order  (linear) 
AR-model  describes  gross  oscillations  in  an  EEG  waveform  but  does  not 
account  for  the  asymmetry  in  that  particular  waveform.  When  these 
oscillations  entrain  another  mechanism,  it  may  not  be  very  interesting 
to  have  the  "optimal"  model.  It  may  be  sufficient  to  specify  roughly  the 
(possibly  somewhat  drifting)  modulating  power  and  frequency  of  these 
oscillations  to  predict  the  behavior  of  the  entrainment. 
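
As a rough illustration of how a second-order AR model summarizes a gross oscillation, the sketch below fits the two coefficients by least squares and converts them into a pole radius (damping) and an oscillation frequency. The function names, the 128-Hz sampling rate, and the simulated 10-Hz rhythm are assumptions made for the example only.

```python
import math
import random


def fit_ar2(x):
    """Least-squares a1, a2 in x[k] = a1*x[k-1] + a2*x[k-2] + e[k]."""
    ks = range(2, len(x))
    s11 = sum(x[k - 1] * x[k - 1] for k in ks)
    s12 = sum(x[k - 1] * x[k - 2] for k in ks)
    s22 = sum(x[k - 2] * x[k - 2] for k in ks)
    b1 = sum(x[k] * x[k - 1] for k in ks)
    b2 = sum(x[k] * x[k - 2] for k in ks)
    det = s11 * s22 - s12 * s12
    # Solve the 2x2 normal equations by Cramer's rule
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return a1, a2


def resonance(a1, a2, fs):
    """Pole radius and oscillation frequency (Hz) of an AR(2) model.

    Valid only for complex poles, i.e. a1**2 + 4*a2 < 0.
    """
    r = math.sqrt(-a2)
    theta = math.acos(a1 / (2.0 * r))
    return r, theta * fs / (2.0 * math.pi)


# Simulated oscillation: poles at radius 0.95 and 10 Hz (hypothetical values)
fs = 128.0
r_true, f_true = 0.95, 10.0
theta_true = 2.0 * math.pi * f_true / fs
a1_true = 2.0 * r_true * math.cos(theta_true)
a2_true = -r_true ** 2

rng = random.Random(2)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(a1_true * x[-1] + a2_true * x[-2] + rng.gauss(0.0, 1.0))

a1, a2 = fit_ar2(x)
r_est, f_est = resonance(a1, a2, fs)
print("estimated pole radius %.3f, frequency %.2f Hz" % (r_est, f_est))
```

The two numbers returned by `resonance` are the kind of rough description (power decay and frequency) that may suffice when only the entrainment behavior is of interest; waveform asymmetry is outside what this linear model can express.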

For other purposes one might not be interested in the predictive
value  of  a  scheme,  but  as  in  pattern  recognition,  one  wishes  to  have  a 
parameterization  of  observation  which  allows  separation  into  different 
classes,  sometimes  even  into  distinct  clusters.  When  such  separation 
is accomplished with low enough error rates, one may well regard a particular
scheme as good. Clearly, when error rates are not low enough,
examination  of  model  residuals  may  tell  whether  there  is  (still)  possibly 
useful  information  left  in  the  original  data  which  might  be  "extracted" 
by  improved  modeling. 


SUBJECTIVE  EVALUATION  AND  EVALUATING  A  LARGE  SYSTEM 

Subjective  performance  evaluation  will  often  occur  in  preliminary 
evaluation  of  simple  structures  or  will  result  from  subjective  cost 
structures.  For  the  preliminary  evaluation  of  simple  models,  plots  of 
model  residuals  and  their  visual  (subjective)  evaluation  are  very 
important;  a  good  treatment  of  this  topic  is  given  by  Draper  and  Smith  (32). 
As  structures  become  more  complex,  their  evaluation  requires  approximations 
(they  are  subjective)  especially  because  of  nonlinear  interactions  (such 
as  limitation  and  decisions)  of  system  components  and  changing  objectives. 

The most important performance measures of large systems we will
use are based on:

1. Reproducibility

2. Speed of processing--real time versus off line

3. Numerical stability and sensitivity

4. Automatic versus manually supervised operations
(selection of starting values)

5. Robustness against artifacts

6. Resistance to operator command errors

7. Structural clarity--e.g., relation of parameters to
physical or physiological processes

8. Ease of modification (e.g., model changes)

Their  mutual  weighting  in  the  evaluation  of  a  particular  component  is 
essentially  a  quite  subjective  task. 

In an environment with the goal of improving these essentially subjective
performance measures, we ask first for diagnostic procedures to
detect  components  which  perform  poorly.  Detection  of  such  components 
(e.g.,  a  spike/wave  detector)  and  their  importance  in  the  overall  scheme 
will  tend  to  be  more  valuable  (because  of  simplicity)  than  a  complete 
ranking  of  the  performance  of  all  components:  delineation  of  poorly 
performing  components  invites  immediate  treatment.  In  view  of  changing 
objectives  and  costs  it  is  unlikely  that  the  effects  of  a  poorly  performing 
component will improve with time. Hence this approach of detecting "bad"
components is one of the important aspects of the philosophy of
troubleshooting, which ultimately may reduce costs or improve utility.

Then,  once  all  "bad"  components  have  been  taken  care  of,  one  may 
proceed  to  examine  more  carefully  the  cost  effectiveness  of  components. 
Possibly  one  determines  also  their  relations  and  the  potential  to  tune 
components  in  terms  of  tradeoffs. 
APPENDIX J

OVERVIEW  OF  APPLICATIONS 

The varied techniques of signal processing discussed earlier in
this  report  can  be  applied  at  several  points,  and  on  several  levels  in 
VER  experimentation.  This  appendix  outlines  some  obvious  applications 
and  discusses  some  of  the  more  promising  processing  techniques  for  each 
area. 


OVERALL  EXPERIMENTATION 

The basic experimental process may be structured as in Figure J-1.
The first step in the design of an experiment is to define objectives;
i.e., to specify what one hopes to learn in the experiment. With these
objectives in mind, one forms a model for the process under investigation
and then designs an experiment to test the model. The actual preparation
and conduct of the experiment may include the minor feedback loop,
from data analysis to experiment set-up, as shown. This loop represents
the real-time use of signal processing in "calibrating" the experiment or
reducing unwanted variations in experimental conditions, for example, the
use of EEG amplitude measurements to place electrodes repeatably.


DATA  PROCESSING  AND  ANALYSIS  FOR  VER  EXPERIMENTS 

The data-processing block of Figure J-1 contains the usual functions
of  signal  analysis  in  experimentation.  For  VER  experiments,  this  block 
may  be  subdivided  into  several  subtasks,  each  of  which  may  be  accomplished 
by  a  different  signal-processing  technique.  The  subdivision  will  also 
be  different  for  "transient"  (single  flashes  or  patterns)  testing  than 
for  "steady-state"  (pattern  reversal)  experiments. 
Steady-State Experiments


For  steady-state  experiments,  the  measured  data  from  one  or  more 
electrodes  will  be  used  to  compute  the  instantaneous  spectrum  of  the 
EEG  (or  a  single  component  of  the  spectrum  at  the  pattern  reversal 
frequency).  This  instantaneous  spectrum  may  then  be  tracked  in  time, 
and any changes (especially in response to blinding flashes) noted.

The first task, that of computing the instantaneous spectrum, is a
problem in spectral estimation as discussed in Appendix C. It is particularly
difficult in this experiment because the spectrum is changing
with time, and one must inevitably trade off the accuracy of larger data
windows against the error caused by spectral changes during the window.
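
The window-length trade-off can be illustrated with a noise-free toy signal. The sketch below computes the power of a single Fourier component in consecutive windows; the function names, the 128-Hz sampling rate, the 4-Hz component, and the step change in amplitude (standing in for a response to a flash) are assumptions for the example, not values taken from the experiments.

```python
import math


def component_power(samples, fs, f0):
    """Power of the Fourier component at f0 (Hz) in one data window."""
    n = len(samples)
    c = sum(x * math.cos(2.0 * math.pi * f0 * k / fs) for k, x in enumerate(samples))
    s = sum(x * math.sin(2.0 * math.pi * f0 * k / fs) for k, x in enumerate(samples))
    # Scaled so that a sinusoid of amplitude A at f0 (whole cycles) gives A**2
    return (2.0 / n) ** 2 * (c * c + s * s)


def tracked_power(signal, fs, f0, window_s):
    """Power at f0 in consecutive non-overlapping windows of window_s seconds."""
    step = int(round(window_s * fs))
    return [component_power(signal[i:i + step], fs, f0)
            for i in range(0, len(signal) - step + 1, step)]


fs, f0 = 128, 4.0
# Amplitude 1.0 for the first 4 s, then 0.2 for the last 4 s
signal = [(1.0 if k < 4 * fs else 0.2) * math.cos(2.0 * math.pi * f0 * k / fs)
          for k in range(8 * fs)]

short = tracked_power(signal, fs, f0, window_s=1.0)   # follows the change
long_ = tracked_power(signal, fs, f0, window_s=8.0)   # smears the change
print("1-s windows:", [round(p, 3) for p in short])
print("8-s window: ", [round(p, 3) for p in long_])
```

The 1-s windows report the power before and after the change separately, while the single 8-s window returns one intermediate value. With measurement noise added, the short windows would also show a larger variance, which is the other side of the trade-off.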


Once the instantaneous spectrum is computed, the modeling and tracking
of the spectrum changes (in response to a stimulus) may be more appropriately
handled in the time domain as discussed in Appendix E. This two-part
analysis is shown in Figure J-2.

Figure J-2. Steady-state analysis.


Transient  Experiments 

In  transient  experiments,  the  EEG  is  measured  at  one  or  more  scalp 
locations,  and  the  data  is  used  to  identify  a  brain  response  to  a  visual 
stimulus.  The  data  processing  may  be  separated  into  two  parts,  as  above, 
but  in  this  case  the  goals  are  decidedly  different.  The  first  task  is  to 
take  the  measurements  and  extract  the  evoked  response  from  the  background 
(spontaneous)  EEG.  Several  different  techniques  for  this  removal  of  the 
EEG  may  be  considered.  The  second  task  is  to  analyze  the  VER,  as  shown  in 
Figure J-3.

Figure J-3. Transient analysis.

Traditionally, the background EEG is averaged out by superimposing
several responses all synchronized by the stimulus times. This technique,
unfortunately, removes some high-frequency VER information and ignores
any response (VER) change from one stimulus to the next. Two other techniques
seem useful for VER extraction and do not possess these drawbacks.
The first would use time series filtering (e.g., ARMA) to track the EEG
before the flash and subtract an estimated EEG from the measurements during
the expected response. The residual should be the VER alone. The second
approach would use multi-lead information (ideally one lead with EEG plus
VER and one with EEG alone) to extract the VER via, for example, Widrow's
method (see Appendix G). The first suggested approach requires temporal
correlation of the background EEG, while the second looks for spatial
correlation of the EEG (but not VER).
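
The second (multi-lead) approach can be sketched with a least-mean-squares adaptive noise canceller of the Widrow-Hoff type. This is a simplified illustration under strong assumptions: the background is simulated white noise, the reference lead sees the background only, and the tap count, step size, and VER shape are hypothetical choices; it is not presented as the exact method of Appendix G.

```python
import math
import random


def lms_cancel(primary, reference, n_taps=4, mu=0.01):
    """LMS noise canceller; returns the error signal (background removed)."""
    w = [0.0] * n_taps
    cleaned = []
    for k in range(len(primary)):
        # Most recent n_taps reference samples, zero-padded at the start
        x = [reference[k - j] if k - j >= 0 else 0.0 for j in range(n_taps)]
        estimate = sum(wj * xj for wj, xj in zip(w, x))
        err = primary[k] - estimate      # what the reference cannot explain
        w = [wj + 2.0 * mu * err * xj for wj, xj in zip(w, x)]
        cleaned.append(err)
    return cleaned


def mean_square(values):
    return sum(v * v for v in values) / len(values)


rng = random.Random(3)
n = 4000
background = [rng.gauss(0.0, 1.0) for _ in range(n)]              # lead 2: EEG alone
ver = [0.5 * math.exp(-((k - 3500) / 20.0) ** 2) for k in range(n)]
# Lead 1: spatially filtered copy of the background plus the VER
primary = [0.8 * background[k] + (0.3 * background[k - 1] if k > 0 else 0.0) + ver[k]
           for k in range(n)]

cleaned = lms_cancel(primary, background)
tail = range(3000, n)
before = mean_square([primary[k] - ver[k] for k in tail])
after = mean_square([cleaned[k] - ver[k] for k in tail])
print("background power around the VER: before %.3f, after %.5f" % (before, after))
```

The canceller relies only on correlation between leads, so it does not need the background to be predictable in time; any VER leaking into the reference lead would be partly cancelled as well.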

Once  the  (pure)  VER  is  obtained,  any  manner  of  signal  processing, 
feature  extraction,  and  pattern  recognition  techniques  may  be  used  to 
examine  the  stimulus-response  behavior.  The  choice  of  technique  will 
depend  on  the  stimulus  and  response  characteristics,  but  should  be  greatly 
facilitated  by  the  visibility  of  the  VER  after  its  extraction  in  the  above 
stage. 


APPENDIX  K 

CURRENT  AND  CLASSIC  SIGNAL  PROCESSOR  PERFORMANCE 

This appendix derives some of the numerical results used in the section
"Analysis of Current Processing." We begin by noting a number of relations
for a Gaussian random variable (x) with mean m and variance $\sigma^2$, denoted by

$$x \sim N(m, \sigma^2).$$

Then the first four moments of x are

$$\overline{x} = m$$

$$\overline{x^2} = m^2 + \sigma^2$$

$$\overline{x^3} = m^3 + 3m\sigma^2$$

$$\overline{x^4} = m^4 + 3\sigma^4 + 6m^2\sigma^2$$

We consider the spectral estimate s:

$$s = x_1^2 + x_2^2,$$

where $x_i \sim N(m, \sigma^2)$, i = 1,2, $x_1$ and $x_2$ are independent, and

$$m = \tfrac{1}{2}\sqrt{PT}, \qquad \sigma^2 = Q/2.$$


Mean of s

The mean of the random variable s is

$$\overline{s} = E(x_1^2) + E(x_2^2) = 2(m^2 + \sigma^2)$$

and, substituting from above,

$$\overline{s} = PT/2 + Q.$$


Variance of s

We may find the variance of s from the relation

$$\mathrm{Var}\, s = E[(s - \overline{s})^2] = E(s^2) - \overline{s}^2.$$

From above we know that

$$\overline{s}^2 = 4(m^2 + \sigma^2)^2.$$

We have that

$$E(s^2) = E(x_1^4 + 2x_1^2 x_2^2 + x_2^4)$$

$$= 2Ex^4 + 2(Ex^2)^2$$

$$= 2(3\sigma^4 + m^4 + 6m^2\sigma^2) + 2(\sigma^2 + m^2)^2.$$

Then

$$\mathrm{Var}\, s = 2[3\sigma^4 + m^4 + 6m^2\sigma^2 - (\sigma^2 + m^2)^2] = 4\sigma^4 + 8m^2\sigma^2.$$

And, substituting for m and $\sigma$,

$$\mathrm{Var}\, s = Q^2 + PTQ.$$

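
The closed-form mean and variance of s can be checked numerically. The sketch below is a Monte Carlo check under the reading m = (1/2) sqrt(PT) and sigma^2 = Q/2, which is the reading consistent with the results PT/2 + Q and Q^2 + PTQ; the values of P, T, and Q, the trial count, and the function name are arbitrary choices for illustration.

```python
import math
import random


def simulate_s(P, T, Q, trials, seed=1):
    """Draw samples of s = x1**2 + x2**2 with x_i ~ N(m, sigma**2)."""
    m = 0.5 * math.sqrt(P * T)       # assumed reading of the mean
    sigma = math.sqrt(Q / 2.0)       # sigma**2 = Q/2
    rng = random.Random(seed)
    return [rng.gauss(m, sigma) ** 2 + rng.gauss(m, sigma) ** 2
            for _ in range(trials)]


P, T, Q = 2.0, 4.0, 1.0              # hypothetical signal power, window, noise level
samples = simulate_s(P, T, Q, trials=200000)
mean_mc = sum(samples) / len(samples)
var_mc = sum((s - mean_mc) ** 2 for s in samples) / (len(samples) - 1)

mean_theory = P * T / 2.0 + Q        # 2 (m^2 + sigma^2)
var_theory = Q ** 2 + P * T * Q      # 4 sigma^4 + 8 m^2 sigma^2
print("mean: simulated %.3f, formula %.3f" % (mean_mc, mean_theory))
print("var:  simulated %.3f, formula %.3f" % (var_mc, var_theory))
```

Agreement within sampling error supports the sign in the relation Var s = E(s^2) minus the squared mean, and the fourth-moment expression used above.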
Classic  Spectral  Estimation 

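The classic technique computes N estimates

$$s_n = x_{1n}^2 + x_{2n}^2, \quad n = 1, \ldots, N$$

from the N windows:

$$z(t+n), \quad t \in (0,T), \quad n = 1, \ldots, N.$$

For each window, the components at 4 Hz are

$$x_{in} \sim N(m, \sigma^2)$$

where

$$m = \tfrac{1}{2}\sqrt{PT}$$

$$\sigma^2 = Q/2$$

and then

$$S = \frac{1}{N} \sum_{n=1}^{N} s_n$$

is the spectral estimate. We wish to compute the mean and variance of S.

Mean

$$\overline{s_n} = \overline{x_{1n}^2} + \overline{x_{2n}^2} = 2(m^2 + \sigma^2)$$

and, therefore,

$$\overline{S} = PT/2 + Q.$$


Variance

To compute the variance of S we need $(\overline{S})^2$ and $\overline{S^2}$.

$$S^2 = \frac{1}{N^2} \sum_{n=1}^{N} \sum_{r=1}^{N} s_n s_r$$

and, for n ≠ r (independent windows),

$$E[s_n s_r] = E[(x_{1n}^2 + x_{2n}^2)(x_{1r}^2 + x_{2r}^2)] = 4\,Ex^2\,Ex^2 = 4(\sigma^2 + m^2)^2,$$

while

$$E[(s_n)^2] = E(x_{1n}^4 + 2x_{1n}^2 x_{2n}^2 + x_{2n}^4) = 2(2\sigma^4 + 4m^2\sigma^2) + 4(m^2 + \sigma^2)^2.$$

Therefore

$$\overline{S^2} = \frac{1}{N^2} \sum_{n=1}^{N} \sum_{r=1}^{N} 4(\sigma^2 + m^2)^2 + \frac{1}{N^2} \sum_{n=1}^{N} 2(2\sigma^4 + 4m^2\sigma^2)$$

$$= 4(\sigma^2 + m^2)^2 + \frac{1}{N}(4\sigma^4 + 8m^2\sigma^2)$$

and

$$(\overline{S})^2 = 4(m^2 + \sigma^2)^2$$

so that

$$\mathrm{Var}\, S = \frac{1}{N}(4\sigma^4 + 8m^2\sigma^2)$$

and, substituting for m and $\sigma$, we have

$$\mathrm{Var}\, S = \frac{1}{N}(Q^2 + PTQ).$$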


Consider the following measure of a spectral estimator:

$$M = \frac{\text{standard deviation}}{\text{mean}}$$

where the lower M is, the better. Then the current estimator has

$$M_c = \frac{[(Q^2/N^2) + (Q/N)PT]^{1/2}}{(PT/2) + (Q/N)}$$

and the classic estimator has

$$M_{cl} = \frac{[(1/N)(Q^2 + PTQ)]^{1/2}}{(PT/2) + Q}.$$

We note in passing that the current estimator performance is independent
of T (the window length), and depends only on the total record
length NT. To see this we let the record length be

$$L = NT.$$

Then

$$M_c = \frac{(Q^2 + PLQ)^{1/2}}{(PL/2) + Q}.$$

Of course, different window lengths result in different frequency resolutions
and thus different responses to noise outside of the 4-Hz band that
we consider. Thus this result should be approached with caution.

An interesting point, however, is that the classic estimator performance
improves as more windows are taken. This improvement is limited,
however, by the low frequency resolution available at short window lengths
and by the eventual correlation (lack of whiteness) of the noise for short
times.

Finally, we are ready to consider the ratio of the measures

$$\eta = M_{cl}/M_c$$

where small $\eta$ (<1) implies the classic technique is better while large
$\eta$ (>1) favors the current method. We let T and N be the same for both
techniques, so that frequency resolution and total data lengths are the
same for both.

We have

$$\eta = \frac{[(1/N)(Q^2 + PTQ)]^{1/2}}{(PT/2) + Q} \cdot \frac{(PT/2) + (Q/N)}{[(Q^2/N^2) + (Q/N)PT]^{1/2}}.$$

Letting $\alpha$ represent a signal to "noise" ratio

$$\alpha = PT/(2Q)$$

we have

$$\eta = \frac{[(1/N)(1 + 2\alpha)]^{1/2}}{1 + \alpha} \cdot \frac{(1/N) + \alpha}{[(1/N^2) + (2\alpha/N)]^{1/2}} = \left[\frac{1 + 2\alpha}{(1/N) + 2\alpha}\right]^{1/2} \frac{(1/N) + \alpha}{1 + \alpha}.$$

As shown in Figure K-1, $\eta$ is always less than 1, although for large $\alpha$,
$\eta \to 1$ [7]. Thus, if a pure sinusoid of sufficient power is present, both
techniques will give equally good performance. For low $\alpha$, however, the
classic technique is better by a factor of

$$\eta = 1/\sqrt{N}.$$

[7] For convenience the asymptotes of $\ln(\eta^2)$ versus $\ln \alpha$ have been
plotted by analogy with Bode techniques.
Figure K-1. Asymptotes of relative performance measure.
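
The behavior of the ratio of measures can be tabulated directly. The sketch below assumes the reading in which the ratio equals [(1 + 2a)/((1/N) + 2a)]^(1/2) ((1/N) + a)/(1 + a) with a = PT/(2Q), and cross-checks it against the two standard-deviation-to-mean measures computed separately; the function names and the numerical values are illustrative only.

```python
import math


def eta(alpha, n_windows):
    """Ratio M_cl / M_c for signal-to-noise ratio alpha and N windows."""
    a = 1.0 / n_windows
    return (math.sqrt((1.0 + 2.0 * alpha) / (a + 2.0 * alpha))
            * (a + alpha) / (1.0 + alpha))


def eta_from_measures(P, T, Q, n_windows):
    """Same ratio, computed from the two std/mean measures separately."""
    # Classic estimator: average of N window estimates
    m_cl = math.sqrt((Q ** 2 + P * T * Q) / n_windows) / (P * T / 2.0 + Q)
    # Current estimator: one estimate over the full record L = N T
    L = n_windows * T
    m_c = math.sqrt(Q ** 2 + P * L * Q) / (P * L / 2.0 + Q)
    return m_cl / m_c


N = 16
for alpha in (0.001, 0.1, 1.0, 10.0, 1000.0):
    print("alpha = %8.3f   eta = %.4f" % (alpha, eta(alpha, N)))
print("low-alpha limit 1/sqrt(N) = %.4f" % (1.0 / math.sqrt(N)))
```

For N = 16 the printed values rise from about 0.25 at low signal-to-noise ratio toward 1 at high ratio, which is the pair of asymptotes described for Figure K-1.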


APPENDIX  L 


A  MODEL  FOR  WEAK  HIGH-FREQUENCY  COMPONENTS 
OF  TRAVELLING  WAVES,  MODIFIED  FROM  LINDSTROM  (64) 

Lindström's model is based on a Fourier analysis and Coulomb's law for
quasi-stationary fields (wave propagation velocity $v \ll c = 1/\sqrt{\epsilon\mu}$). The explanation
we give here is a simplified qualitative geometric consideration.

We may start out by considering a wave, such as an action potential,
as composed of a continuum of infinitely long sinusoidal waves. For simplicity,
we will only consider far-field effects (h >> x), as shown for two
sinusoidal waves in Figure L-1.


[Figure L-1 labels: h >> x; wave 1; wave 2]

Figure L-1. The far-field contribution of two
waves with different spatial frequencies.

From Coulomb's law we realize that the contribution of the positive
halfwave #1 (darkened area), shown in Figure L-1, will decrease like
1/r as the distance of that halfwave #1 from the point P is increased.
However, the contribution of the wave #2 within the window shown (with
twice that spatial frequency) will simultaneously decrease like $1/r^2$ since
it contributes a dipole. Correspondingly the contribution of higher spatial
frequency components, say of order p, will decrease like $1/r^p$. However,
these higher spatial frequency components are also the components generating
the high-frequency components in time, since they all travel with the same
velocity as the underlying wave (such as the action potential). A drastic
decrease of high-frequency components due to dealing with waves, all traveling
with constant velocity, is thus to be expected. Integration (Lindström
and Magnusson (64)) for the infinitely long structure yields for the
far-field "transfer function"

$$\Psi(j\omega) = C\,[K_0(\omega h/v)]\,\exp(-j\omega x_0/v),$$

where $K_0$ is the modified Bessel function of the second kind (order zero). For large arguments,
one can use the approximation $K_0(u) \approx (\pi/2u)^{1/2} \exp(-u)$, showing a predominantly
exponential decline of the transfer with increasing frequency.
Note,  there  is  no  need  to  consider  only  a  single  fiber  for  this  model; 
there  could  be  many  synapsing  fibers  in  series  giving  rise  to  the  propagation 
of  a  potential  wave.  Qualitatively,  the  above  equation  shows  how  quickly 
with increasing distance h and increasing frequency $\omega$ the transfer of
potentials decreases. This result is interesting since it suggests
the possibility of tuning sensor electrodes to nearby sources by selecting
high-frequency  components.  With  such  a  goal  in  mind,  methods  to  reduce 
electronic  noise  of  currently  used  equipment  become  important  (see 
subsection  "Frequency-Dependent  Properties  of  Macroelectrodes  (and 
Amplifiers)"). 
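
The decline of the far-field transfer with frequency and distance can be made concrete numerically. The sketch below evaluates the order-zero modified Bessel function of the second kind from its standard integral representation, K0(u) = integral from 0 to infinity of exp(-u cosh t) dt, and compares it with the large-argument approximation. The propagation velocity, electrode distances, and frequencies are hypothetical values chosen only to show the trend, and the constant C is set to 1.

```python
import math


def bessel_k0(u, t_max=10.0, steps=10000):
    """K0(u), u > 0, by trapezoidal integration of exp(-u*cosh(t)) over t >= 0."""
    h = t_max / steps
    total = 0.5 * math.exp(-u)                 # half weight at t = 0
    for k in range(1, steps + 1):
        # The integrand underflows harmlessly to 0.0 for large t
        total += math.exp(-u * math.cosh(k * h))
    return h * total


def k0_large_argument(u):
    """Large-argument approximation (pi / 2u)**0.5 * exp(-u)."""
    return math.sqrt(math.pi / (2.0 * u)) * math.exp(-u)


def transfer_magnitude(freq_hz, h, v, c=1.0):
    """Magnitude C*K0(omega*h/v) of the far-field transfer function."""
    return c * bessel_k0(2.0 * math.pi * freq_hz * h / v)


v = 1.0                                        # m/s, hypothetical velocity
for h in (0.01, 0.02):                         # m, hypothetical source distances
    for f in (10.0, 50.0, 100.0):              # Hz
        print("h = %.2f m  f = %5.1f Hz  |transfer| = %.3e"
              % (h, f, transfer_magnitude(f, h, v)))
print("K0(10) = %.4e, approximation = %.4e" % (bessel_k0(10.0), k0_large_argument(10.0)))
```

Doubling the distance h has the same effect as doubling the frequency, since only the product of frequency and h/v enters the argument; this is the basis for the suggestion that high-frequency components favor nearby sources.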
APPENDIX M


PATTERN  RECOGNITION  TECHNIQUES 

Pattern recognition techniques have been developed to provide computer
assistance to the problems of analyzing, detecting, recognizing,
and  describing  patterns  in  apparently  erratic  data.  As  a  result,  an 
entire  field  of  study  has  evolved  which  has  found  application  in  diverse 
fields,  including  engineering,  computer  science,  biology,  psychology,  and 
medicine.  The  techniques  of  pattern  recognition  have  been  especially 
useful  in  the  biomedical  area,  due  to  lack  of  appropriate  physical  models 
with  which  to  describe  the  processes  of  interest. 

Pattern recognition is generally divided into several sequential
steps, as illustrated in Figure M-1. The raw data is first conditioned
(e.g., to remove unwanted frequency components or artifacts), with the
proviso that no information is lost in the process. The next step is
feature extraction, where the desire is to extract a minimal set of
information-bearing parameters. This step may be viewed as a process of
data compression. The final step is the classification process, where
decision rules are utilized to classify the features. Since the features
contain (ideally) the same information as the raw data, the classifier
actually classifies the raw data.

Figure M-1. Information flow in typical pattern recognition process.

We discuss briefly here approaches to feature extraction and pattern
recognition  which  might  be  particularly  useful  for  VER  analysis. 


FEATURE  EXTRACTION 

Perhaps  the  most  important  part  of  any  pattern  recognition  scheme 
is  the  feature  extractor.  In  this  appendix,  a  mathematical  model  will  be 
derived  which  will  generate  signals  that  closely  approximate  measured 
VERs. The model can account for the information-carrying signal components,
along with the various noises corrupting them. The model can be
derived from a training set via the Karhunen-Loeve expansion technique.
The  coefficients  of  the  expansion  then  become  the  features  of  the  VER. 

A  model  of  the  VER  must  be  able  to  account  for  the  variations  from 
cycle  to  cycle  (i.e.,  each  time  the  stimulus  pattern  is  repeated)  in 
both  amplitude  and  period  (stimulus  rate).  Moreover,  it  must  be  able  to 
take  advantage  of  the  possibly  strong  correlation  between  signals  from 
different  electrodes.  Any  variation  in  the  stimulus  rate  means  that  the 
time  origin  must  be  reset  with  onset  of  each  stimulus.  The  model  to  be 
developed  attempts  to  describe  the  VER  and  the  identifiable  sources  of 
noise as observed by measurements on the head surface. For ease of
implementation  on  a  digital  computer,  only  linear  discrete  time  models 
will  be  considered. 


First, we consider the VER model. An efficient means of characterizing
a sample waveform from an ensemble of statistically nonstationary
waveforms in terms of a set of parameters $\alpha_i$ is:

$$y(n) = \overline{y}(n) + \sum_{i=1}^{M} \alpha_i \phi_i(n) + e(n); \quad n = 1,2,\ldots,N_R \qquad \text{(M.1)}$$

where

1) $\overline{y}(n)$ is the average value of the waveforms at the nth sample.

2) e(n) is the truncation error corresponding to M terms.

3) $\phi_i(n)$; n = 1,2,...,$N_R$ are a complete set of orthonormal basis
functions.

4) $N_R$ is the number of samples in the waveform (assumed of
standard duration).

The coefficients $\alpha_i$ can be assembled into a vector $\boldsymbol{\alpha}$ called the pattern
vector of a particular VER. Of the several techniques for generating the
desired basis functions, the method chosen is the Karhunen-Loeve expansion,
which has the following desirable properties (Fukunaga (39)):

1) It minimizes the expected value of the error energy

$$J = E\left\{ \sum_{i=1}^{N_R} e^2(i) \right\}.$$

2) It maximizes the distance between independent samples from a
single distribution, as defined by the scatter measure

$$\sigma^2 = E\{ \|\boldsymbol{\alpha}_i - \boldsymbol{\alpha}_j\|^2 \}.$$

3) It minimizes the population entropy, defined by

$$h = -E\{ \ln p(\boldsymbol{\alpha}) \},$$

where $p(\boldsymbol{\alpha})$ is the probability density function of $\boldsymbol{\alpha}$.

4) The coefficients are statistically uncorrelated.
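The basis functions are determined as follows. Let

$$\delta\mathbf{y} = [\delta y(1)\ \delta y(2)\ \cdots\ \delta y(N_R)]^T$$

$$\boldsymbol{\phi}_i = [\phi_i(1)\ \phi_i(2)\ \cdots\ \phi_i(N_R)]^T; \quad i = 1,2,\ldots,N_R$$

$$\mathbf{e} = [e(1)\ e(2)\ \cdots\ e(N_R)]^T$$

where $\delta y(n) = y(n) - \overline{y}(n)$. Then (M.1) can be expressed as

$$\delta\mathbf{y} = \sum_{i=1}^{M} \alpha_i \boldsymbol{\phi}_i + \mathbf{e} = \Phi\boldsymbol{\alpha} + \mathbf{e} \qquad \text{(M.2)}$$

where $\Phi = [\boldsymbol{\phi}_1\ \boldsymbol{\phi}_2\ \cdots\ \boldsymbol{\phi}_M]$. The eigenfunctions are orthonormal in the sense
that

$$\boldsymbol{\phi}_i^T \boldsymbol{\phi}_j = \delta_{ij}$$

where $\delta_{ij}$ is the Kronecker delta function. Therefore,

$$\boldsymbol{\alpha} = \Phi^T \delta\mathbf{y} \qquad \text{(M.3)}$$

Let the covariance matrix of $\delta\mathbf{y}$ be

$$R = E[\delta\mathbf{y}\,\delta\mathbf{y}^T] \qquad \text{(M.4)}$$

where $E(\delta\mathbf{y}) = 0$ by definition. Then the basis functions are the eigenvectors
of the covariance matrix R. A particular eigenvalue is the expected
value of the energy associated with its eigenfunction.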
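
The steps above (ensemble mean, covariance matrix, leading eigenvectors, coefficients, reconstruction) can be sketched in a few lines. This is a minimal illustration on a synthetic ensemble, not VER data: the eigenvectors are found by power iteration with deflation purely to keep the example dependency-free, and all function names, waveform shapes, and noise levels are assumptions.

```python
import math
import random


def matvec(R, v):
    return [sum(rij * vj for rij, vj in zip(row, v)) for row in R]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kl_basis(waveforms, m, iterations=300):
    """Mean waveform and the m leading eigenpairs of the sample covariance R."""
    count, n = len(waveforms), len(waveforms[0])
    mean = [sum(w[i] for w in waveforms) / count for i in range(n)]
    dev = [[w[i] - mean[i] for i in range(n)] for w in waveforms]
    # R = E[dy dy^T], estimated by the ensemble average
    R = [[sum(d[i] * d[j] for d in dev) / count for j in range(n)] for i in range(n)]
    eigenvalues, basis = [], []
    for _ in range(m):
        v = [math.cos(1.0 + i) for i in range(n)]      # arbitrary start vector
        for _ in range(iterations):
            w = matvec(R, v)
            norm = math.sqrt(dot(w, w))
            v = [x / norm for x in w]
        lam = dot(v, matvec(R, v))
        eigenvalues.append(lam)
        basis.append(v)
        # Deflation: remove the component just found before the next search
        R = [[R[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return mean, eigenvalues, basis


def features(waveform, mean, basis):
    """Pattern vector: projections of (y - mean) on the basis functions."""
    dy = [y - ybar for y, ybar in zip(waveform, mean)]
    return [dot(phi, dy) for phi in basis]


def reconstruct(alpha, mean, basis):
    return [ybar + sum(a * phi[i] for a, phi in zip(alpha, basis))
            for i, ybar in enumerate(mean)]


# Synthetic ensemble: two random-amplitude shapes plus small noise (16 samples each)
rng = random.Random(4)
n_r = 16
shape_a = [math.sin(math.pi * (i + 0.5) / n_r) for i in range(n_r)]
shape_b = [math.sin(2.0 * math.pi * (i + 0.5) / n_r) for i in range(n_r)]
ensemble = []
for _ in range(300):
    ca, cb = rng.gauss(0.0, 2.0), rng.gauss(0.0, 1.0)
    ensemble.append([ca * a + cb * b + rng.gauss(0.0, 0.1)
                     for a, b in zip(shape_a, shape_b)])

mean, eigenvalues, basis = kl_basis(ensemble, m=2)
alphas = [features(w, mean, basis) for w in ensemble]
errors = [sum((y - yh) ** 2 for y, yh in zip(w, reconstruct(al, mean, basis)))
          for w, al in zip(ensemble, alphas)]
mean_error_energy = sum(errors) / len(errors)
total_energy = sum(sum((y - ybar) ** 2 for y, ybar in zip(w, mean))
                   for w in ensemble) / len(ensemble)
cross = sum(al[0] * al[1] for al in alphas) / len(alphas)
print("eigenvalues:", [round(lam, 2) for lam in eigenvalues])
print("residual energy fraction: %.4f" % (mean_error_energy / total_energy))
```

Here two coefficients per waveform replace sixteen samples, and the sample cross-correlation of the two coefficients (`cross`) vanishes to rounding error, which illustrates the uncorrelated-coefficient property. For real data a library eigenvalue routine would normally replace the power iteration.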

If y(n) is a sample function from one of k different stochastic
processes (i.e., generated by different brain processes), the Karhunen-Loeve
expansion is still optimal in that it minimizes the mean residual
energy and the population entropy, when the covariance matrix is defined
appropriately (Chien and Fu (22)):

$$R = \sum_{i=1}^{k} p_i R_i$$

where $R_i$ is the covariance of the ith stochastic process, which has probability
of occurrence $p_i$.

For  a  vector  stochastic  process  (we  are  simultaneously  measuring 
different components of the VER with multiple electrodes), the expansion
can  be  easily  extended. 

An example of the use of a Karhunen-Loeve expansion for ECG data
is given in Figure M-2. Three different types of heartbeats are shown and
compared to a 10th-order expansion (RECON). Since there were 200 samples in
the original data, the data compression is a factor of 20:1. Furthermore,
the reconstruction error is seen to be very small, in this case
not large enough to give a different cardiac diagnosis.


PATTERN  CLASSIFICATION 

A  wide  variety  of  techniques  are  available  for  classifying  a  set 
of  feature  vectors.  They  may  be  conveniently  divided  on  the  basis  of 
the  process  used  to  determine  the  location  of  classes  in  feature  space 
(learning).  Supervised  learning  implies  that  all  data  are  labeled 
according  to  class.  In  unsupervised  learning,  the  data  are  unlabeled  and 
classes  are  typically  generated  using  cluster  analysis. 

Many  supervised  learning  techniques  are  available.  However,  the  ones 
which  are  most  widely  applicable  to  relatively  unstructured  data  such  as 
the VER are the so-called nonparametric methods. Among these, the partitioning
decision tree approach of Friedman (38) is particularly
powerful, as well as being ideally suited to computer implementation.

This  method  will  construct  decision  trees  to  any  arbitrary  accuracy  on 
the  training  set  and  can  handle  multiple  classes  in  a  straightforward 
manner. It is presently being applied to classification of ECGs by
Scientific  Systems,  Inc. 

Cluster analysis might be particularly useful in the early stages
of investigation of the properties of the VER. By using this technique,
it may be possible to gain insight into the structure of the data and
determine whether the VERs tend to fall into distinct types. The most
popular and generally applicable clustering techniques are iterative
ones, using similarity measures between points in feature space, and
employing hierarchical or nearest-neighbor decision rules. A good review
is given by Ball (8). More recently, the use of fuzzy set theory has
been proposed for unsupervised learning, in order to eliminate the necessity
of using zero-one membership functions (a point is either in the
class or not). This work has led to a class of "Fuzzy ISODATA" algorithms
(Bezdek (11)) which are easily implementable and are particularly
applicable to problems in which there are smooth transitions from one
class to another (e.g., slightly overlapping classes). Since the results
are completely data-dependent, it is not possible to present results or
even predict the outcome of using such techniques on the VER. However,
they do represent the most generally powerful approach to data clustering
and thus have potential for VER analysis.
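
The idea of graded (rather than zero-one) class membership can be sketched with the commonly used fuzzy c-means iteration, which alternates a membership update and a weighted-center update. This is offered as an illustration in the spirit of the Fuzzy ISODATA family, not as a reproduction of the cited algorithm; the two-dimensional synthetic features, the fuzziness exponent m = 2, and the function names are assumptions.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(point, centers, m=2.0):
    """Fuzzy memberships of one point: u_i = 1 / sum_j (D_i / D_j)**(1/(m-1)),
    where D_i is the squared distance to center i."""
    d = [max(squared_distance(point, v), 1e-12) for v in centers]
    return [1.0 / sum((di / dj) ** (1.0 / (m - 1.0)) for dj in d) for di in d]


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates from the given initial centers."""
    for _ in range(iterations):
        u = [memberships(p, centers, m) for p in points]
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            total = sum(weights)
            new_centers.append([sum(w * p[k] for w, p in zip(weights, points)) / total
                                for k in range(len(points[0]))])
        centers = new_centers
    return centers


# Two synthetic feature clusters (e.g., two expansion coefficients per response)
rng = random.Random(5)
points = ([[rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)] for _ in range(100)] +
          [[rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)] for _ in range(100)])
# Deterministic start: the first point and the point farthest from it
start = [points[0], max(points, key=lambda p: squared_distance(p, points[0]))]
centers = sorted(fuzzy_c_means(points, start))
u_mid = memberships([1.5, 1.5], centers)
print("centers:", [[round(c, 2) for c in v] for v in centers])
print("memberships of a point midway between the classes:", [round(x, 2) for x in u_mid])
```

A point midway between the two classes receives memberships near one half in each, rather than being forced into one class, which is the property that makes the approach attractive for overlapping classes.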
Figure M-2. Illustrating the data-compression capability of the Karhunen-Loeve
expansion, ECG data with variable morphology. Compression
ratio 20:1.


APPENDIX  N 


A  SET  OF  RECOMMENDATIONS  FOR  SEVERAL  PROBLEM  AREAS 

The  recommendations  given  below  are  derived  from  our  analysis  of 
the  pertinent  literature,  especially  literature  concerning  the  evidence 
for  certain  properties  of  EEG/VER  signals  and  literature  on  experimental 
procedures  and  considerations.  The  recommendations  fall  into  the  following 
groups:  signal  analysis,  stimulus  design/experiment  design,  choice  of 
electrode  location,  medical/psychophysical,  and  future  aspects.  Within 
each  group  we  followed  (tentatively)  some  ordering  corresponding  to  the 
apparent  increase  in  complexity  of  the  recommendation  and/or  decreasing 
expected  pay-off  in  terms  of  reduced  variability. 


SIGNAL  ANALYSIS 

1. Incorporate information from frequencies other than the fundamental
pattern reversal rate. Especially harmonics and frequencies
between harmonics should be considered in the development
of measures of visual performance. To distinguish
between experimental conditions (e.g., preflash vs. postflash),
learning schemes such as the Friedman algorithm (38) [developed at
the Stanford Linear Accelerator Center for similar purposes],
which selects by itself the important features (given a set of
measures), could be useful.

2.  Variability  in  itself  should  not  be  regarded  as  adverse. 
Variability  itself  may  provide  a  measure  of  visual  performance 
as  concluded  from  studies  by  Ciganek  (25)  and  recently  by 
Callaway  (20). 

3. The general techniques for signal analysis described in
Appendixes C-G should be applied. The importance and power of
these (statistical) techniques are demonstrated in the recently
published report by Chapman et al. (21), which used a principal
component analysis to extract known (well-established) VER properties
and to discover new ones simultaneously.
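
As an illustration of recommendations 1 and 3 above, the following Python/NumPy sketch reads the amplitude spectrum of a single epoch at the harmonics of the pattern reversal rate and at the midpoints between them, and performs a plain principal component analysis on a set of epochs. The function names, the choice of midpoints as the "between" frequencies, and the SVD-based PCA are assumptions made for illustration; they are not taken from Chapman et al. (21) or from the Friedman algorithm (38).

```python
import numpy as np


def harmonic_amplitudes(epoch, fs, reversal_rate, n_harmonics=4):
    """Amplitudes at harmonics of the reversal rate and halfway between them.

    Assumption: the epoch length is chosen so that every requested frequency
    falls on (or very near) an FFT bin; otherwise leakage biases the values.
    """
    n = len(epoch)
    # Single-sided amplitude spectrum (a unit-amplitude sinusoid on a bin gives 1).
    amplitude = np.abs(np.fft.rfft(epoch)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    # Targets: f0, 1.5*f0, 2*f0, ..., n_harmonics*f0.
    targets = reversal_rate * np.arange(2, 2 * n_harmonics + 1) / 2.0
    # Index of the FFT bin nearest to each target frequency.
    nearest = np.abs(freqs[:, None] - targets[None, :]).argmin(axis=0)
    return targets, amplitude[nearest]


def principal_components(epochs, n_components=2):
    """Plain PCA of an (n_epochs, n_samples) array via SVD of the centered data.

    Returns (components, scores, explained_variance_ratio).
    """
    centered = epochs - epochs.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    ratio = variance[:n_components] / variance.sum()
    scores = u[:, :n_components] * s[:n_components]
    return vt[:n_components], scores, ratio
```

The amplitudes returned by `harmonic_amplitudes` could serve as the "set of measures" handed to a feature-selecting classifier, while `principal_components` gives a first look at which waveform shapes account for most of the epoch-to-epoch variance.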


EXPERIMENT  DESIGN 

1. The overall performance of the data-acquisition system has to be
verified. By that we mean setting up experiments (complete simulation
including TV), possibly with a dummy subject (some conductive
material fed with active electrodes) that generates known waveforms.
These waveforms must be faithfully retrievable from the collected
data base before any further data are collected on actual subjects.
The reliability of the system has to be rechecked from time to time.

2. The best brightness level should be determined. Studies by Riggs
and Wooten (90, p. 707) suggest that very stable VER amplitudes
are obtained at 0.3 log units of brightness above threshold, while
variability increases with increasing brightness. Nachimas
(73, p. 71) gives similar results for the cat (microelectrode studies).
One way to increase possibly weak responses might be to use zig-zag
lines as recommended by MacKay and Jeffreys (66).

Use  masking  noise  in  the  experiments  and  verify  its  effectiveness. 
In  most  of  the  literature  this  point  is  stressed. 

Use stimuli with random intervals. Investigate the importance
and variability of the CI, ..., CIII (complex) waveform (Jeffreys, in
Desmedt (28)) and P10-P150 (or N), as described above in the section
for signal analysis recommendations.

Check the importance of dipole reversal by pattern (mirror image)
reversal, similar to Jeffreys (in Desmedt (28)).
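
The dummy-subject check in recommendation 1 and the use of random stimulus intervals can be rehearsed in simulation before any hardware is involved. The sketch below (Python/NumPy) injects a known waveform at randomly spaced onsets into Gaussian noise, averages the stimulus-locked epochs, and correlates the average with the injected waveform. The waveform shape, sampling rate, interval range, noise level, and function names are all hypothetical choices, and white Gaussian noise is only a crude stand-in for the spontaneous EEG.

```python
import numpy as np


def random_onsets(n_stimuli, min_isi, max_isi, fs, rng):
    """Onset sample indices with inter-stimulus intervals drawn uniformly
    from [min_isi, max_isi] seconds."""
    isis = rng.uniform(min_isi, max_isi, size=n_stimuli)
    return np.round(np.cumsum(isis) * fs).astype(int)


def averaged_response(recording, onsets, epoch_len):
    """Average of fixed-length epochs, each starting at a stimulus onset."""
    epochs = np.stack([recording[o:o + epoch_len] for o in onsets])
    return epochs.mean(axis=0)


def recovery_correlation(known, recovered):
    """Pearson correlation between the injected and the recovered waveform."""
    return float(np.corrcoef(known, recovered)[0, 1])


# Simulated end-to-end check (all numbers are illustrative assumptions).
rng = np.random.default_rng(0)
fs = 250                      # samples per second
epoch_len = 125               # 0.5 s analysis window
t = np.arange(epoch_len) / fs
known = np.sin(2 * np.pi * 5 * t) * np.exp(-t / 0.15)   # hypothetical test waveform

# Intervals of 0.8 to 1.2 s exceed the 0.5 s window, so epochs never overlap.
onsets = random_onsets(200, 0.8, 1.2, fs, rng)
recording = rng.normal(0.0, 1.0, size=onsets[-1] + epoch_len)
for o in onsets:
    recording[o:o + epoch_len] += known

recovered = averaged_response(recording, onsets, epoch_len)
r = recovery_correlation(known, recovered)
```

With a real system, `recording` would come from the dummy subject through the full acquisition chain, and a low value of `r` would indicate that the chain distorts or loses the known waveform.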


ELECTRODE  LOCATION 


The earlobes should not be regarded as "ground" and should not
be connected to each other. Instead, the differential voltage for the
left and right sides (e.g., inion vs. left or right ear) should
be recorded. When grounding of the subject is necessary
(especially with high-input impedance amplifiers), any point on
the body may be tried. Care should be taken to shield the
cables and to avoid magnetic loops.

The number of recording electrodes should be increased.
In particular, we suggest four additional electrodes placed slightly
(2 cm) above, below, to the left, and to the right of the currently
used electrode. Since the analog-to-digital conversion rate may be
limited, the amplifier bandwidth and the A/D sampling rate should
be reduced as necessary.

Search  for  optimal  electrode  location  individually  for  each  subject. 
Grass  (42)  argues  that  many  investigations  rely  too  much  on 
standard  lead  arrangements.  One  possible  way  to  speed  up  such  a 
search  is  to  use  an  array  (e.g.,  horizontal  linear  array)  which 
is  applied  at  different  locations.  Signals  from  the  data  analysis 
scheme  could  indicate  the  adequacy  of  the  location  and/or  select 
the  "best"  electrodes. 

Monitor saccades, eyeblinks, and other muscle potentials (jaws,
heart, etc.).
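
The trade-off between the number of electrodes and the per-channel sampling rate, noted above for a rate-limited A/D converter, is simple arithmetic. The Python sketch below assumes a single converter multiplexed evenly across channels; the function name, the guard factor of 2.5 samples per cycle of the highest retained frequency, and the example rate of 1000 conversions per second are hypothetical and not taken from the study.

```python
def per_channel_budget(aggregate_rate_hz, n_channels, guard_factor=2.5):
    """Per-channel sampling rate and usable amplifier bandwidth for one
    A/D converter shared evenly among n_channels.

    guard_factor is the ratio of sampling rate to amplifier bandwidth;
    it must be at least 2 to satisfy the sampling theorem.
    """
    if n_channels < 1:
        raise ValueError("need at least one channel")
    if guard_factor < 2.0:
        raise ValueError("guard_factor below 2 would allow aliasing")
    fs_per_channel = aggregate_rate_hz / n_channels
    bandwidth_hz = fs_per_channel / guard_factor
    return fs_per_channel, bandwidth_hz


# Current electrode plus four additional ones, hypothetical 1000 conversions/s.
fs_each, bandwidth = per_channel_budget(1000.0, 5)
```

Any channels added for monitoring eye movements or muscle potentials share the same converter and would have to be included in `n_channels`.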


MEDICAL-PSYCHOLOGICAL  ASPECTS 

The use of (additional) drugs such as the anxiolytic diazepam should
be considered for the restrained animal, in order to restore a
near-normal EEG.

The  subject  should  be  assigned  an  appropriate  task.  As  MacKay 
and  Jeffreys  (66)  point  out,  a  subject  without  a  task  is  not 
in  a  "neutral"  state. 


FUTURE  POSSIBILITIES


1.  Electrode  impedance  can  be  calibrated  automatically  and  periodically 
(for  safety  standards,  see  Underwriter  Labs.  Manual  (111)). 

2. Low-input impedance amplifiers should be tried (Van der Ziel (113)).
This approach may necessitate the above periodic recalibration.

3. The currently used frequency band should be extended to higher
frequencies. This approach, if of value, will probably require
low-input impedance amplifiers.
